
4.2.1 The MUSTARD algorithm

Combining the ideas of multi-step inertial methods and variable metric methods, we propose a new operator splitting algorithm, the variable metric MUlti-Step inerTial operAtoR splitting methoD, dubbed “MUSTARD”, for solving (Pinc) and (Popt). The details of the algorithm are given in Algorithm 7.

Chapter 4 4.2. Variable metric multi-step inertial operator splitting

Algorithm 7: The MUSTARD algorithm

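Initial: $s \in \mathbb{N}_+$ and $S \stackrel{\mathrm{def}}{=} \{0, \dots, s-1\}$; let $\mathcal{M}_\nu$ be as defined in (4.1.5); $x_0 \in \mathcal{H}$, $x_{-i} = x_0$, $i \in S$. Choose $\underline{\epsilon}, \overline{\epsilon} > 0$ such that $\underline{\epsilon} \le 2\beta\nu - \overline{\epsilon}$.

repeat

Let $\{a_{i,k}\}_{i\in S}, \{b_{i,k}\}_{i\in S} \in \,]-1, 2]^s$, $V_k \in \mathcal{M}_\nu$ and $\gamma_k \in [\underline{\epsilon}, 2\beta\nu - \overline{\epsilon}]$:

$$
\begin{aligned}
y_{a,k} &= x_k + \sum_{i\in S} a_{i,k}(x_{k-i} - x_{k-i-1}), \\
y_{b,k} &= x_k + \sum_{i\in S} b_{i,k}(x_{k-i} - x_{k-i-1}), \\
x_{k+1} &= J_{\gamma_k V_k^{-1} A}\big(y_{a,k} - \gamma_k V_k^{-1} B(y_{b,k})\big).
\end{aligned}
\tag{4.2.1}
$$

$k = k + 1$;

until convergence;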
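As a rough illustration of iteration (4.2.1), the following Python sketch runs the scheme with the fixed metric V_k = Id on a small separable toy problem, min_x F(x) + λ||x||_1 with F(x) = ½ Σ_j d_j (x_j − f_j)², so that B = ∇F and the resolvent of γA = γλ∂||·||_1 is soft-thresholding. The toy problem, the function names (`mustard_identity_metric`, `soft_threshold`, `grad_F`) and the constant inertial parameters are assumptions made for this example; they are not part of Algorithm 7.

```python
def soft_threshold(v, t):
    """Resolvent of t * (subdifferential of |.|): scalar soft-thresholding."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def mustard_identity_metric(grad_F, lam, x0, a, b, gamma, n_iter=200):
    """Sketch of (4.2.1) with V_k = Id, A = lam * subdifferential of the l1-norm,
    B = grad_F, and inertial parameters a[i], b[i] (i in S) kept constant in k."""
    s = len(a)
    # history[j] holds x_{k-j}; all past points are initialised to x_0.
    history = [list(x0) for _ in range(s + 1)]
    for _ in range(n_iter):
        x = history[0]
        n = len(x)
        y_a = list(x)
        y_b = list(x)
        for i in range(s):
            for j in range(n):
                diff = history[i][j] - history[i + 1][j]  # x_{k-i} - x_{k-i-1}
                y_a[j] += a[i] * diff
                y_b[j] += b[i] * diff
        g = grad_F(y_b)
        # Backward (resolvent) step applied to the forward step y_a - gamma * B(y_b).
        x_next = [soft_threshold(y_a[j] - gamma * g[j], gamma * lam) for j in range(n)]
        history = [x_next] + history[:-1]
    return history[0]


# Toy data (assumed for illustration only).
d = [1.0, 0.5, 0.1]
f = [2.0, -1.0, 0.05]
lam = 0.1
beta = 1.0 / max(d)   # grad F is (1/beta)-Lipschitz, hence beta-cocoercive
gamma = beta          # a step-size inside ]0, 2*beta[


def grad_F(x):
    return [d[j] * (x[j] - f[j]) for j in range(len(x))]


# s = 1 with a = b = 0 is plain Forward-Backward; s = 2 uses one negative parameter.
x_fb = mustard_identity_metric(grad_F, lam, [0.0] * 3, a=[0.0], b=[0.0], gamma=gamma)
x_two_step = mustard_identity_metric(grad_F, lam, [0.0] * 3,
                                     a=[0.6, -0.1], b=[0.6, -0.1], gamma=gamma)
# Closed-form minimiser of the separable toy problem, used only for comparison.
x_star = [soft_threshold(f[j], lam / d[j]) for j in range(3)]
```

On this toy problem both runs can be compared against `x_star`; the sketch says nothing about which parameter choice is faster in general.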

Remark 4.2.1. The main reasons for considering the monotone inclusion (Pinc) instead of the optimization problem (Popt), and for involving a variable metric, are as follows:

(i) Problem (Popt) is only a special case of (Pinc). In particular, for various operator splitting methods (e.g. non-relaxed DR, GFB, and Primal–Dual splitting), the fixed-point iterations can be written as certain monotone inclusion problems in which the monotone operators involved are not subdifferentials of convex functions.

(ii) As we have seen from the examples of Primal–Dual splitting methods (e.g. (2.4.14) in Section 2.4.3.3 and (3.4.17) in Section 3.4.3), the corresponding monotone inclusion formulations are stated under a different metric (i.e. the metric V for the two examples).

As a result, considering the monotone inclusion problem and involving the variable metric allow us to extend the MUSTARD algorithm to a broad class of operator splitting methods.

Relation to previous work. In form, the MUSTARD Algorithm 7 is the most general variable metric Forward–Backward splitting we are aware of, and it is new to the literature for $s \ge 2$. For the case $s = 1$, it is more general than Algorithm 6 since a variable metric is considered, and it recovers the variable metric Forward–Backward proposed in [62] if $s = 0$.

If we choose $s = 1$ and $V_k = \mathrm{Id}$, then depending on the choice of the inertial parameters $a_k$ and $b_k$, Algorithm 7 relates to the aforementioned work as follows:

• $a_k = 0$, $b_k = 0$: this is the original FB method [119,140].

• $a_k \in [0, \bar{a}]$, $b_k = 0$: this is the case studied in [131] for (Pinc). In the context of optimization with $R = 0$, one recovers the heavy ball method [146].

• $a_k \in [0, \bar{a}]$, $b_k = a_k$: this corresponds to the work of [120] for solving (Pinc). If moreover one restricts $\gamma_k \in \,]0, \beta]$ and lets $a_k \to 1$, then Algorithm 6 specializes to FISTA-type methods [24,50,13,12] developed for optimization.

• $a_k \in [0, \bar{a}]$, $b_k \in \,]0, \bar{b}]$, $a_k \neq b_k$: the general inertial FB scheme Algorithm 6.

Below we also highlight several important characteristics of the MUSTARD algorithm.

(i) Similarly to some existing inertial methods [131,120,115], the range of the step-size $\gamma_k$ allowed by MUSTARD is $]0, 2\beta[$ if $V_k \equiv \mathrm{Id}$;

(ii) The algorithm allows multiple steps, the number of which is given by $s$. In particular for $s = 2$, we show that very promising practical results can be obtained (see Section 4.5).

(iii) We allow negative inertial parameters. For $s = 1$, the inertial parameters should be positive and lie in $[0, 1[$ to ensure convergence. However, for $s \ge 2$, we will show that one can benefit from negative choices of the inertial parameters. In particular, for $s = 2$, the numerical experiments of Section 4.5 suggest that a good choice of the inertial parameters is $a_{0,k}, b_{0,k} \in \,]0, 2]$ and $a_{1,k}, b_{1,k} \in \,]-1, 0]$.

Such an inertial setting can be investigated through the dynamical system perspective, see below for a short introduction.

(iv) As the problem we are considering is the general monotone inclusion problem, we can generalize the MUSTARD algorithm to methods whose iterations are related to some monotone inclusion problem, for example the non-relaxed DR, GFB and several Primal–Dual splitting methods.

MUSTARD as a discretised dynamical system. Now consider the metric-free case of the MUSTARD algorithm for the optimization problem (Popt), i.e. $V = \mathrm{Id}$. Consider the following second-order dynamical system

$$\ddot{x}(t) + c_0(t)\dot{x}(t) + (\partial R + \nabla F)(x(t)) = 0, \tag{4.2.2}$$

where $c_0(t) \ge 0$ is an asymptotically vanishing viscous damping function. Typically, $c_0(t)$ moderately decreases to 0, i.e. $\lim_{t\to+\infty} c_0(t) = 0$ and

tc0(t) = +∞.

Let $0 < \omega_2 < \omega_1$ be two weights such that $\omega_1 + \omega_2 = 1$, $h > 0$ be the time step-size, $t_k = kh$ and $x_k = x(t_k)$. Consider an implicit (backward Euler) discretization w.r.t. $\partial R$, an explicit (forward Euler) discretization w.r.t. $\nabla F$, and a weighted sum of explicit and implicit discretizations of $\ddot{x}(t)$, i.e.

$$0 \in \frac{\omega_1}{h^2}(x_{k+1} - 2x_k + x_{k-1}) + \frac{\omega_2}{h^2}(x_k - 2x_{k-1} + x_{k-2}) + \frac{c_0(kh)}{h}(x_k - x_{k-1}) + \partial R(x_{k+1}) + \nabla F(y_{b,k}).$$

Then we obtain the following inclusion

$$x_k + a_{0,k}(x_k - x_{k-1}) + a_{1,k}(x_{k-1} - x_{k-2}) - \gamma\nabla F(y_{b,k}) \in x_{k+1} + \gamma\partial R(x_{k+1}), \tag{4.2.3}$$

where $a_{0,k} = 1 - \frac{\omega_2}{\omega_1} - \frac{h c_0(kh)}{\omega_1}$, $a_{1,k} = \frac{\omega_2}{\omega_1}$ and $\gamma = \frac{h^2}{\omega_1}$. If we moreover set

$$y_{b,k} = x_k + b_{0,k}(x_k - x_{k-1}) + b_{1,k}(x_{k-1} - x_{k-2}),$$

with $b_{0,k}, b_{1,k}$ properly chosen, then we obtain the MUSTARD scheme for the case $s = 2$ and $V_k \equiv \mathrm{Id}$.
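For readers who want the intermediate algebra, the passage from the discretized inclusion to (4.2.3) can be written out as below. This is only a routine rearrangement under the stated definitions of a_{0,k}, a_{1,k} and γ; it is a reading aid and uses no assumption beyond ω1 > 0.

```latex
% Multiply the discretized inclusion by \gamma = h^2/\omega_1:
\begin{align*}
0 \in{}& (x_{k+1} - 2x_k + x_{k-1})
        + \tfrac{\omega_2}{\omega_1}(x_k - 2x_{k-1} + x_{k-2})
        + \tfrac{h c_0(kh)}{\omega_1}(x_k - x_{k-1})
        + \gamma\,\partial R(x_{k+1}) + \gamma\nabla F(y_{b,k}).
\end{align*}
% Use  2x_k - x_{k-1} = x_k + (x_k - x_{k-1})  and
%      x_k - 2x_{k-1} + x_{k-2} = (x_k - x_{k-1}) - (x_{k-1} - x_{k-2}),
% then move x_{k+1} + \gamma\partial R(x_{k+1}) to the right-hand side:
\begin{align*}
x_k + \Big(1 - \tfrac{\omega_2}{\omega_1} - \tfrac{h c_0(kh)}{\omega_1}\Big)(x_k - x_{k-1})
    + \tfrac{\omega_2}{\omega_1}(x_{k-1} - x_{k-2}) - \gamma\nabla F(y_{b,k})
  \in x_{k+1} + \gamma\,\partial R(x_{k+1}).
\end{align*}
```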

If we choose $c_0(kh) = \frac{d}{kh}$, $d > 3$, then (4.2.3) simplifies to the following inclusion

$$x_k + \Big(1 - \frac{d}{\omega_1 k}\Big)(x_k - x_{k-1}) - \frac{\omega_2}{\omega_1}(x_k - 2x_{k-1} + x_{k-2}) - \gamma\nabla F(y_{b,k}) \in x_{k+1} + \gamma\partial R(x_{k+1}).$$

If we further let $\omega_1 = 1$, $\omega_2 = 0$ and $y_{b,k} = x_k + b_k(x_k - x_{k-1})$, $b_k \in [0, 1]$, then we recover a special case of Algorithm 6. If one moreover sets $b_k = a_k = 1 - d/k$, then we obtain the FISTA scheme as studied in [160,11,50].
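The coefficient formulas attached to (4.2.3) are easy to check numerically. The short Python sketch below evaluates a_{0,k}, a_{1,k} and γ from the weights, the time step and the damping value; the function name `inertial_coefficients` and the sample values of d, h, k are assumptions chosen for illustration.

```python
def inertial_coefficients(omega1, omega2, h, c0_kh):
    """Coefficients of (4.2.3) obtained from the weighted discretization.

    Returns (a0, a1, gamma) with
      a0    = 1 - omega2/omega1 - h*c0(kh)/omega1,
      a1    = omega2/omega1,
      gamma = h**2/omega1.
    """
    a0 = 1.0 - omega2 / omega1 - h * c0_kh / omega1
    a1 = omega2 / omega1
    gamma = h ** 2 / omega1
    return a0, a1, gamma


# Damping c0(kh) = d/(kh) with omega1 = 1, omega2 = 0: one expects a0 = 1 - d/k, a1 = 0.
d, h, k = 4.0, 0.1, 10
a0, a1, gamma = inertial_coefficients(1.0, 0.0, h, d / (k * h))

# Same damping with a genuinely weighted second difference (illustrative weights).
a0_w, a1_w, gamma_w = inertial_coefficients(0.75, 0.25, h, d / (k * h))
```

With ω1 = 1 and ω2 = 0 the second inertial term disappears and the first coefficient reduces to 1 − d/k, which is the FISTA-type parameter mentioned in the text.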

4.2.2 Global convergence of MUSTARD

In this section, we present the global convergence analysis for the MUSTARD algorithm. We summarize our results as follows:

(i) $s \ge 2$: In Theorem 4.2.3, we establish conditional convergence of the sequence $\{x_k\}_{k\in\mathbb{N}}$ for a fixed metric $V$. The terminology “conditional convergence” refers to the fact that, for the sequence to converge, the sequences $\{a_{i,k}\}_{i\in S}, \{b_{i,k}\}_{i\in S}$ have to be chosen depending (conditionally) on the sequence $\{x_k\}_{k\in\mathbb{N}}$ in such a way that an appropriate condition holds, e.g. (4.2.5). Unfortunately, for the case $s \ge 2$ we so far only have a result for a fixed metric $V_k \equiv V$. However, this is sufficient to cover many algorithms of interest.

(ii) $s = 1$:

(a) In Theorem 4.2.5 we prove conditional convergence of $\{x_k\}_{k\in\mathbb{N}}$ for a variable metric $V_k$.

(b) We also devise choices of the inertial parameters and metrics that are independent of $\{x_k\}_{k\in\mathbb{N}}$ and still guarantee global convergence (see Theorem 4.2.9). We dub this unconditional convergence.

All the proofs of the above results are gathered in Section 4.6.

For the sake of generality, we consider the inexact version of the MUSTARD algorithm. The following definition is needed.

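Definition 4.2.2 (ε-enlargement). Let $A : \mathcal{H} \rightrightarrows \mathcal{H}$ be a set-valued maximal monotone operator and $\varepsilon \ge 0$. Then the ε-enlargement of $A$ is defined by

$$A^{\varepsilon}(x) \stackrel{\mathrm{def}}{=} \big\{ v \in \mathcal{H} : \langle u - v,\, y - x \rangle \ge -\varepsilon, \ \forall y \in \mathcal{H},\ u \in A(y) \big\}.$$

From the definition, it is easy to verify that for $0 \le \varepsilon_1 \le \varepsilon_2$ we have $A^{\varepsilon_1}(x) \subset A^{\varepsilon_2}(x)$ and $A^{0}(x) = A(x)$. Thus $A^{\varepsilon}$ is an enlargement of $A$.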
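A small worked instance may help fix ideas. Taking A = Id (an example chosen here for illustration, not one discussed in the text), the definition can be evaluated in closed form by completing the square in y:

```latex
\begin{align*}
\inf_{y \in \mathcal{H}} \langle y - v,\, y - x \rangle
  &= \inf_{y \in \mathcal{H}} \Big\| y - \tfrac{v + x}{2} \Big\|^2 - \tfrac{1}{4}\|v - x\|^2
   = -\tfrac{1}{4}\|v - x\|^2, \\
\text{hence}\quad \mathrm{Id}^{\varepsilon}(x)
  &= \big\{ v \in \mathcal{H} : \|v - x\| \le 2\sqrt{\varepsilon} \big\}.
\end{align*}
```

So for the identity the ε-enlargement at x is the closed ball of radius 2√ε around x, which shrinks to {x} = Id(x) as ε → 0 and grows with ε, in line with the two properties stated after the definition.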

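For the updating step of $x_{k+1}$ in (4.2.1), consider the following inexact form

$$y_{a,k} - \gamma_k\big(V_k^{-1} B(y_{b,k}) + \xi_k\big) - x_{k+1} \in \gamma_k V_k^{-1} A^{\varepsilon_k}(x_{k+1}),$$

where $\xi_k \in \mathcal{H}$ is the error in the evaluation of the gradient operator $B$, and $\varepsilon_k$ is the enlargement error. Then we obtain the inexact form of MUSTARD

$$
\begin{aligned}
& y_{a,k} = x_k + \sum_{i\in S} a_{i,k}(x_{k-i} - x_{k-i-1}), \\
& y_{b,k} = x_k + \sum_{i\in S} b_{i,k}(x_{k-i} - x_{k-i-1}), \\
& y_{a,k} - \gamma_k\big(V_k^{-1} B(y_{b,k}) + \xi_k\big) - x_{k+1} \in \gamma_k V_k^{-1} A^{\varepsilon_k}(x_{k+1}).
\end{aligned}
\tag{4.2.4}
$$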

4.2.2.1 Conditional convergence

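We first present the conditional convergence of the inexact MUSTARD algorithm. For each $i \in S$, define $\zeta_{i,k} \stackrel{\mathrm{def}}{=} a_{i,k} - \frac{\gamma_k}{2\beta\nu} b_{i,k}$ and $a_i \stackrel{\mathrm{def}}{=} \sup_{k\in\mathbb{N}} |a_{i,k}|$.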

Theorem 4.2.3 (Conditional convergence, $s \ge 2$). For the inexact MUSTARD iteration (4.2.4), let conditions (A.1)-(A.3) hold, fix the metric $V_k \equiv V \in \mathcal{M}_\nu$, and let $\xi_k \equiv 0$. Suppose that the following two conditions hold:

(i) the error $\{\varepsilon_k\}_{k\in\mathbb{N}} \in \ell^1_+$;

(ii) the inertial parameters $\{a_{i,k}\}_{i\in S}$ are such that $\sum_{i\in S} a_i < 1$.

Then the generated sequence $\{x_k\}_{k\in\mathbb{N}}$ is bounded. If moreover the following summability condition holds

$$\sum_{k\in\mathbb{N}} \max\Big\{ \max_{i\in S} \zeta_{i,k}^2,\ \max_{i\in S} |b_{i,k}|,\ \max_{i\in S} |a_{i,k}| \Big\} \sum_{i\in S} \|x_{k-i} - x_{k-i-1}\|^2 < +\infty, \tag{4.2.5}$$

then there exists an $x^\star \in \mathrm{zer}(A + B)$ such that the sequence $\{x_k\}_{k\in\mathbb{N}}$ weakly converges to $x^\star$.

The proof of the theorem can be found in Section 4.6, from page 83.

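Remark 4.2.4. If the inertial parameters $\{a_{i,k}\}_{i\in S}, \{b_{i,k}\}_{i\in S}$ are chosen in $[0, 1]$ such that $\zeta_{i,k}^2 = \big(a_{i,k} - \frac{\gamma_k b_{i,k}}{2\beta\nu}\big)^2 \le a_{i,k}^2$, then condition (4.2.5) simplifies to

$$\sum_{k\in\mathbb{N}} \max\Big\{ \max_{i\in S} |b_{i,k}|,\ \max_{i\in S} |a_{i,k}| \Big\} \sum_{i\in S} \|x_{k-i} - x_{k-i-1}\|^2 < +\infty.$$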

Condition (4.2.5) can be enforced by a simple online updating rule such as, for each $i \in S$, given $a_i, b_i \in [0, 1]$,

$$a_{i,k} = \min\{a_i, c_{a,i,k}\}, \qquad b_{i,k} = \min\{b_i, c_{b,i,k}\}, \tag{4.2.6}$$

where $c_{a,i,k}, c_{b,i,k} > 0$, and $\max\{c_{a,i,k}, c_{b,i,k}\} \sum_{i\in S} \|x_{k-i} - x_{k-i-1}\|^2$ is summable. For instance, one can choose

$$c_{a,i,k} = \frac{c_{a,i}}{k^{1+\delta} \sum_{i\in S} \|x_{k-i} - x_{k-i-1}\|^2}, \qquad c_{a,i} > 0,\ \delta > 0,$$

and similarly for $c_{b,i,k}$.
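The online updating rule (4.2.6), with the particular choice of c_{a,i,k} suggested above, can be sketched in a few lines of Python. The function name `online_inertial_parameter`, its argument names, and the handling of the degenerate case where the iterates did not move are assumptions made for this illustration.

```python
def online_inertial_parameter(a_cap, c, k, delta, sq_diffs):
    """One parameter of rule (4.2.6): min{a_cap, c_k} with
    c_k = c / (k**(1 + delta) * sum_i ||x_{k-i} - x_{k-i-1}||^2).

    a_cap    : the prescribed value a_i (or b_i) in [0, 1]
    c, delta : positive constants
    k        : iteration counter, k >= 1
    sq_diffs : squared norms ||x_{k-i} - x_{k-i-1}||^2 for i in S
    """
    total = sum(sq_diffs)
    if total == 0.0:
        # Assumption of this sketch: if the iterates did not move over the
        # last s steps, the cap is vacuous and the prescribed value is kept.
        return a_cap
    return min(a_cap, c / (k ** (1.0 + delta) * total))


# Large recent movement at a late iteration: the cap is active.
capped = online_inertial_parameter(0.9, c=1.0, k=10, delta=0.5, sq_diffs=[0.5, 0.5])
# Small recent movement at an early iteration: the prescribed value is kept.
kept = online_inertial_parameter(0.9, c=1.0, k=1, delta=0.5, sq_diffs=[0.01, 0.01])
```

By construction the returned parameter times the sum of squared differences is at most c / k^{1+δ}, which is a summable sequence; this is the mechanism by which the rule enforces the summability condition.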

When $s = 1$, we have the following theorem with a variable metric $V_k$. To lighten the notation, for $s = 1$ we write $a_k = a_{0,k}$ and $b_k = b_{0,k}$.

Theorem 4.2.5 (Conditional convergence, $s = 1$). For the inexact MUSTARD iteration (4.2.4), let conditions (A.1)-(A.3) hold. Suppose that the following conditions are satisfied:

(i) for the metric sequence $\{V_k\}_{k\in\mathbb{N}} \in \mathcal{M}_\nu$ with $\nu > 0$, there exists a non-negative sequence $\{\eta_k\}_{k\in\mathbb{N}} \in \ell^1_+$ such that $\mu = \sup_{k\in\mathbb{N}} \|V_k\| < +\infty$ and $(1 + \eta_k) V_k \succcurlyeq V_{k+1}$;

(ii) the inertial parameter is $a_k \in [0, 1]$ such that $\bar{c} \stackrel{\mathrm{def}}{=} \sup_{k\in\mathbb{N}} a_k (1 + \eta_{k-1}) < 1$, and

$$\sup_{k\in\mathbb{N}} \frac{1}{1 - \bar{c}} \sum_{m=1}^{k} \big(1 - \bar{c}^{\,k-m+1}\big) \frac{\eta_m}{1 + \eta_m} < 1; \tag{4.2.7}$$

(iii) the errors $\{\varepsilon_k\}_{k\in\mathbb{N}} \in \ell^1_+$ and $\{\|\xi_k\|\}_{k\in\mathbb{N}} \in \ell^1_+$.

Then the generated sequence $\{x_k\}_{k\in\mathbb{N}}$ is bounded. If moreover the following summability condition holds

$$\sum_{k\in\mathbb{N}} \max\{a_k, b_k\} \|x_k - x_{k-1}\|^2 < +\infty, \tag{4.2.8}$$

then there exists an $x^\star \in \mathrm{zer}(A + B)$ such that the sequence $\{x_k\}_{k\in\mathbb{N}}$ weakly converges to $x^\star$.

The proof of the theorem can be found in Section 4.6, from page 86.

Remark 4.2.6.

(i) If the sequence $\{\eta_k\}_{k\in\mathbb{N}}$ satisfies $\sum_{k\in\mathbb{N}} \eta_k < 1 - \bar{c}$, then condition (4.2.7) is in force. Given $\bar{c} \in [0, 1[$, let $\delta, \kappa > 0$, and set $\eta_k$ as

$$\eta_k = \frac{\kappa}{k^{1+\delta}}.$$

Then for fixed $\delta$, (4.2.7) can be met with a proper choice of $\kappa$;

(ii) If $a_k \ge b_k$, then (4.2.8) recovers the conditions in [131,120] for the conditional convergence of $\{x_k\}_{k\in\mathbb{N}}$.
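A quick numerical sanity check of Remark 4.2.6 (i) can be sketched as follows. The function evaluates the left-hand side of (4.2.7) over a finite horizon only, so it is a proxy for the supremum and not a proof; the name `condition_427_value` and the sample values of c̄, κ, δ and the horizon are assumptions chosen for illustration.

```python
def condition_427_value(c_bar, eta, horizon):
    """Largest value over k = 1..horizon of
         (1 / (1 - c_bar)) * sum_{m=1}^{k} (1 - c_bar**(k - m + 1)) * eta(m) / (1 + eta(m)),
    i.e. a finite-horizon proxy for the supremum in (4.2.7)."""
    worst = 0.0
    for k in range(1, horizon + 1):
        total = sum((1.0 - c_bar ** (k - m + 1)) * eta(m) / (1.0 + eta(m))
                    for m in range(1, k + 1))
        worst = max(worst, total / (1.0 - c_bar))
    return worst


c_bar, delta = 0.5, 1.0


def make_eta(kappa):
    # eta_k = kappa / k**(1 + delta), as in Remark 4.2.6 (i).
    return lambda m: kappa / m ** (1.0 + delta)


small_kappa = condition_427_value(c_bar, make_eta(0.2), horizon=200)
large_kappa = condition_427_value(c_bar, make_eta(2.0), horizon=200)
```

For κ = 0.2 the series Σ η_k = 0.2·π²/6 stays below 1 − c̄ = 0.5, so the computed value remains below 1, whereas κ = 2 already violates the bound after two terms. This matches the remark that, for fixed δ, κ has to be chosen small enough.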

An empirical choice of the inertial parameters. We introduce two empirical ways to set up the inertial parameters. For the sake of simplicity, let $V_k \equiv \mathrm{Id}$, hence $\nu = 1$. Consider the constant parameter setting,

$$\gamma \in \,]0, 2\beta[ \quad\text{and}\quad b_i = a_i \in \,]-1, 2[, \ i \in S.$$

Moreover, let $(a_i)_{i\in S}$ be monotone non-increasing, i.e. $a_0 \ge a_1 \ge \dots \ge a_{s-1}$.

Summarizing from multi-step inertial Forward–Backward and gradient descent, we obtain the following two empirical bounds for the sum $\sum_{i\in S} a_i$:

$$\text{“Upper bound 1”:}\ \sum_{i} a_i \in \Big]0, \min\Big\{1, \frac{2\beta - \gamma}{\gamma}\Big\}\Big[, \qquad \text{“Upper bound 2”:}\ \sum_{i} a_i \in \Big]0, \min\Big\{1, \frac{2\beta - \gamma}{2|\beta - \gamma|}\Big\}\Big[. \tag{4.2.9}$$

In practice, to ensure the convergence of the generated sequence {x_k}_k∈N, these two bounds should be applied together with the online updating rule of inertial parameters (4.2.6). Most of the time, with proper choice of each a_i, (4.2.6) may never be triggered.
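The two empirical bounds in (4.2.9) are simple to tabulate. The Python sketch below returns the right end-point of each interval as a function of β and γ; the function names and the explicit treatment of γ = β (where the ratio in the second bound is unbounded, so the minimum is 1) are choices made for this illustration.

```python
def upper_bound_1(beta, gamma):
    """min{1, (2*beta - gamma) / gamma}, for gamma in ]0, 2*beta[."""
    return min(1.0, (2.0 * beta - gamma) / gamma)


def upper_bound_2(beta, gamma):
    """min{1, (2*beta - gamma) / (2*|beta - gamma|)}, for gamma in ]0, 2*beta[."""
    if gamma == beta:
        # The ratio is unbounded at gamma = beta, so the minimum is 1.
        return 1.0
    return min(1.0, (2.0 * beta - gamma) / (2.0 * abs(beta - gamma)))


beta = 1.0
# gamma <= beta: both bounds equal 1.
small_step = (upper_bound_1(beta, 0.8), upper_bound_2(beta, 0.8))
# gamma > beta: both bounds drop below 1, the second one more slowly.
large_step = (upper_bound_1(beta, 1.5), upper_bound_2(beta, 1.5))
```

For γ = 1.5β the first bound gives 1/3 and the second 1/2, so a sum of inertial parameters equal to 0.4 would be admitted by the second bound but not by the first.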

Remark 4.2.7. Comparing (4.2.9) with (ii) of Theorem 4.2.3, the main difference is that here we consider the sum with signs. This means that we can choose positive inertial parameters bigger than 1 and then compensate with negative ones. As a matter of fact, as we will see in the numerical experiments, a negative inertial parameter can make the convergence even faster. For instance, for the case $s = 2$ with $\sum_i a_i$ fixed, the choice $a_1 < 0 < a_0$ may outperform the one with $a_1, a_0 \ge 0$.

The two upper bounds are shown graphically in Figure 4.1. It can be observed that for $\gamma \le \beta$, the largest value that can be allowed is 1, which corresponds to the choice of the FISTA method, whose inertial parameter tends to 1 as $k \to +\infty$.

[Figure 4.1 plot: horizontal axis $\gamma/\beta$ from 0 to 2, vertical axis $\sum_i a_i$ from 0 to 1.]

Figure 4.1: Two empirical upper bounds for the sum of inertial parameters $\sum_i a_i$: “Upper bound 1”, $\sum_{i\in S} a_i \in \big]0, \min\{1, \frac{2\beta - \gamma}{\gamma}\}\big[$; “Upper bound 2”, $\sum_{i\in S} a_i \in \big]0, \min\{1, \frac{2\beta - \gamma}{2|\beta - \gamma|}\}\big[$.

Remark 4.2.8.

(i) Between the two bounds in (4.2.9), “Upper bound 2” is much less stringent than “Upper bound 1”.

(ii) For inertial Forward–Backward, $\sum_i a_i$ too close to 1 is not a good choice. Such an observation (e.g. from the numerical experiments in Section 4.5) coincides with the existing studies on FISTA (e.g. the local oscillation). A theoretical explanation for such behaviour when $\sum_i a_i$ is too close to 1 is left to Chapter 6.

All the above remarks will be made clear in the numerical experiment section, typically for the multi-step inertial Forward–Backward splitting method.

Lastly, it should be emphasised that the two empirical bounds in (4.2.9) are designed for multi-step inertial FB, gradient descent, and the original PPA method. They may not work for other inertial schemes. As a matter of fact, as we will see in the numerical experiments of Section 4.5, the choices of inertial parameters for inertial Douglas–Rachford and the Chambolle–Pock Primal–Dual splitting method [51] are rather limited. Moreover, compared to inertial Forward–Backward and gradient descent, the gains of inertia in DR and Primal–Dual splitting are very small.

The reasons underlying such differences in the acceleration brought by inertia to different algorithms are quite complicated to justify in general. However, they can be explained partly through the local linear convergence analysis, as we will describe in Chapters 6-8.

4.2.2.2 Unconditional convergence

Besides the conditional convergence, we can devise choices of $\{\{a_{i,k}\}_{i\in S}\}_{k\in\mathbb{N}}$ and $\{\{b_{i,k}\}_{i\in S}\}_{k\in\mathbb{N}}$ that are independent of $\{x_k\}_{k\in\mathbb{N}}$ and still guarantee global convergence. We dub this unconditional convergence. The following result generalizes those in [5,131,120,115].

For the unconditional convergence of Algorithm 7, we restrict ourselves to the case $s = 1$.

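Theorem 4.2.9 (Unconditional convergence). For the inexact MUSTARD iteration (4.2.4), let conditions (A.1)-(A.3) hold. Suppose that the following conditions are satisfied:

(i) for the metric sequence $\{V_k\}_{k\in\mathbb{N}} \in \mathcal{M}_\nu$ with $\nu > 0$, there exists a non-negative sequence $\{\eta_k\}_{k\in\mathbb{N}} \in \ell^1_+$ such that $\mu = \sup_{k\in\mathbb{N}} \|V_k\| < +\infty$ and $(1 + \eta_k) V_k \succcurlyeq V_{k+1}$;

(ii) the inertial parameters $a_k, b_k \in [0, 1]$ are chosen such that (4.2.7) holds and moreover there exists $\tau > 0$ with

$$
\begin{cases}
1 + a_k - \frac{\gamma_k}{2\beta\nu}(1 + b_k)^2 + \eta_{k-1} b_k (b_k + 1) \ge \tau : & a_k \le \frac{\gamma_k}{2\beta\nu} b_k, \\[4pt]
1 - (3 + 2\eta_k) a_k - \frac{\gamma_k}{2\beta\nu}(1 + b_k)^2 + \eta_{k-1} b_k (b_k - 1) \ge \tau : & b_k \le a_k \ \text{or}\ \frac{\gamma_k}{2\beta\nu} b_k \le a_k < b_k;
\end{cases}
\tag{4.2.10}
$$

(iii) the errors are $\{\varepsilon_k\}_{k\in\mathbb{N}} \in \ell^1_+$ and $\{\|\xi_k\|\}_{k\in\mathbb{N}} \in \ell^1_+$.

Then $\sum_{k\in\mathbb{N}} \|x_k - x_{k-1}\|^2 < +\infty$, and there exists $x^\star \in \mathrm{zer}(A + B)$ such that the sequence $\{x_k\}_{k\in\mathbb{N}}$ converges weakly to $x^\star$.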

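See Section 4.6 for the proof, from page 90. When the metric $V_k$ is fixed, i.e. $V_k \equiv V \in \mathcal{M}_\nu$, then $\eta_k \equiv 0$ and condition (4.2.10) simplifies to

$$
\begin{cases}
1 + a_k - \frac{\gamma_k}{2\beta\nu}(1 + b_k)^2 \ge \tau : & a_k \le \frac{\gamma_k}{2\beta\nu} b_k, \\[4pt]
1 - 3a_k - \frac{\gamma_k}{2\beta\nu}(1 + b_k)^2 \ge \tau : & b_k \le a_k \ \text{or}\ \frac{\gamma_k}{2\beta\nu} b_k \le a_k < b_k.
\end{cases}
\tag{4.2.11}
$$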

Figure 4.2 shows graphically the conditions (4.2.11). We choose $\tau = 0.01$ and consider two different choices of $\gamma$. It can be observed that as $\gamma$ becomes bigger, the range of $a, b$ allowed by (4.2.10) becomes smaller. Moreover, compared to the empirical choice of inertial parameters, the choices allowed by (4.2.10) are quite conservative. For instance, for the case $b_k \equiv a_k \equiv a$ and $\gamma = \beta\nu$, the biggest value that can be allowed is $a \equiv \sqrt{5} - 2$. In comparison, when $B = 0$, $b_k$ vanishes and the upper bound of $a_k$ is $1/3$, which coincides with the result of [4,5].
