
We proceed to the convergence proofs. We first prove convergence when (dk)k∈N are chosen stochastically. As in the previous chapter, we suppose that (dk)k∈N are randomly, independently drawn, and that the support of the probability density of Ξ is dense in Sn−1.

We will make use of the following results from [9], which extend the Lipschitz continuity of the Clarke generalised derivative to the constrained case for the hypertangent cone, and continuity to its closure, the Clarke tangent cone.

Proposition 5.17 ([9, Lemma 3.8]). Let F be Lipschitz continuous with Lipschitz constant L> 0 near x ∈ Ω. If d and e are in TH(x), then


Proposition 5.18 ([9, Proposition 3.9]). Let F be Lipschitz continuous near x ∈ Ω. If Ω is epi-Lipschitzian at x and $d \in T^{Cl}_\Omega(x)$, then

$$F^\circ(x; d) = \lim_{e \to d,\ e \in T^H_\Omega(x)} F^\circ(x; e).$$

Note that it follows from Proposition 5.18 that if Ω is epi-Lipschitzian at x, then for Clarke stationarity on Ω, it is sufficient to verify optimality on $T^H_\Omega(x)$ rather than $T^{Cl}_\Omega(x)$.

We define X to be the set of nonstationary points,

$$X = \{x \in \Omega : F \text{ is not Clarke stationary at } x \text{ restricted to } \Omega\}. \tag{5.6}$$

Theorem 5.19. Let $F : \Omega \to \mathbb{R}$ be locally Lipschitz continuous, coercive, and bounded below, where $\Omega \subset \mathbb{R}^n$ is epi-Lipschitzian. Let $(x^k)_{k \in \mathbb{N}}$ solve Algorithm 4 for $\varepsilon_x = \varepsilon_F = 0$, where $(d^k)_{k \in \mathbb{N}}$ are independently drawn from the random distribution Ξ, and suppose that the support of the density of Ξ is dense in $S^{n-1}$. Then $P(S \cap X \neq \emptyset) = 0$, i.e. the limit set S is almost surely in the set of stationary points.

Proof. This proof is analogous to the proof for the unconstrained case, except for the additional treatment of points satisfying the constraints and of the progress parameter γ.

We will construct a countable collection of open sets $(B_j)_{j \in \mathbb{N}}$ such that $X \subset \bigcup_{j \in \mathbb{N}} B_j$ and such that for all $j \in \mathbb{N}$ we have $P(S \cap B_j \neq \emptyset) = 0$. The result then follows from countable additivity of probability measures.

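First, we show that for every $x \in X$, there are $d \in S^{n-1} \cap T^H_\Omega(x)$, $\varepsilon > 0$, and $\delta > 0$ such that

$$y - \lambda e \in \Omega, \quad \frac{F(y - \lambda e) - F(y)}{\lambda} \le -\varepsilon, \qquad \forall\, y \in B_\delta(x),\ e \in B_\delta(d) \cap S^{n-1},\ \lambda \in (0, \delta). \tag{5.7}$$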

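To show this, note that if $x \in X \subset \Omega$, then as Ω is epi-Lipschitzian, there are $d \in S^{n-1} \cap T^H(x)$ and $\varepsilon > 0$ such that

$$F^\circ(x; -d) = \limsup_{\substack{y \to x,\ \lambda \downarrow 0 \\ y \in \Omega,\ y - \lambda d \in \Omega}} \frac{F(y - \lambda d) - F(y)}{\lambda} \le -\varepsilon.$$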

Therefore, there is $\eta > 0$ such that for all $\lambda \in (0, \eta)$ and all $y \in B_\eta(x)$, we have

$$\frac{F(y - \lambda d) - F(y)}{\lambda} \le -\varepsilon/2.$$

As F is Lipschitz continuous around $B_\eta(x)$, it is clear that the mapping

$$e \mapsto \frac{F(y - \lambda e) - F(y)}{\lambda}$$

is also locally Lipschitz continuous. By this and since $d \in T^H(x)$, it follows that there exists $\delta \in (0, \eta)$ such that for all $y \in B_\delta(x) \cap \Omega$, all $e \in B_\delta(d) \cap S^{n-1}$, and all $\lambda \in (0, \delta)$, we have

$$y - \lambda e \in \Omega, \quad \text{and} \quad \frac{F(y - \lambda e) - F(y)}{\lambda} \le -\varepsilon/3.$$

This concludes the first part.

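Next, for $m \in \mathbb{N}$, we define the set

$$X_m = \left\{ x \in X : \text{(5.7) holds for some } d \in S^{n-1} \cap T^H(x),\ \varepsilon > 0, \text{ and all } \delta < 1/m \right\}.$$

Clearly

$$X = \bigcup_{m \in \mathbb{N}} X_m.$$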

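Let $(y^i)_{i \in \mathbb{N}}$ be a dense sequence in $X_m$, which exists because $\mathbb{Q}^n$ is both countable and dense in Ω. We define $Y_i^{(m)} = B_\delta(y^i)$, where $\delta = \frac{1}{m+1}$. Therefore,

$$X_m \subset \bigcup_{i \in \mathbb{N}} Y_i^{(m)} \quad \Longrightarrow \quad X \subset \bigcup_{m \in \mathbb{N}} \bigcup_{i \in \mathbb{N}} Y_i^{(m)}.$$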

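Since a countable union of countable sets is countable, we conclude with the following statement. For each $i \in \mathbb{N}$ there are $y^i \in \Omega$, $\varepsilon_i > 0$, $\delta_i > 0$, and $\tilde{d}^i \in S^{n-1} \cap T^H(y^i)$, such that for all $z \in B_{\delta_i}(y^i)$, all $\tilde{d} \in B_{\delta_i}(\tilde{d}^i) \cap S^{n-1}$, and all $\lambda \in (0, \delta_i)$, we have

$$z - \lambda \tilde{d} \in \Omega, \quad \frac{F(z - \lambda \tilde{d}) - F(z)}{\lambda} \le -\varepsilon_i,$$

and such that

$$X \subset \bigcup_{i \in \mathbb{N}} B_{\delta_i}(y^i). \tag{5.8}$$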

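Finally, we show that for each $i \in \mathbb{N}$, almost surely, $S \cap B_{\delta_i}(y^i) = \emptyset$. For a given i, write $B_i := B_{\delta_i}(y^i)$, and define $m := \min_{x \in B_i} F(x)$ and $M := \max_{x \in B_i} F(x)$. We argue as follows. The existence of an accumulation point of $(x^k)_{k \in \mathbb{N}}$ in $B_i$ would imply that there is a subsequence $(x^{k_j})_{j \in \mathbb{N}} \subset B_i$. Suppose $x^{k_j} \in B_i$ and $d^{k_j+1} \in B_{\delta_i}(\tilde{d}^i)$. Then $x^{k_j+1} = x^{k_j} - \alpha_{k_j} d^{k_j+1}$, where $\alpha_{k_j}$ either solves (5.2) for $\tau_{k_j}$, or solves (5.2) for some $\tilde{\tau}_{k_j} \in (0, \tau_{k_j}]$ and is such that $\alpha_{k_j} \ge \gamma \delta_i$ (since any λ such that $x^k - \lambda d^{k+1} \notin \Omega$ must be greater than $\delta_i$).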

If $\alpha_{k_j}$ solves (5.2) for $\tau_{k_j}$, then by the analysis in the proof of Theorem 4.6, we know that

$$F(x^{k_j}) - F(x^{k_j+1}) \ge \min\left( \varepsilon_i^2 \tau_{\min},\ \frac{\delta_i^2}{\tau_{\max}} \right).$$

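Otherwise, $\alpha_{k_j} > \gamma \delta_i$ and there is $\tilde{\tau}_{k_j} \in (0, \tau_{\max}]$ such that $x^{k_j+1} = x^{k_j} - \alpha_{k_j} d^{k_j+1}$ and $F(x^{k_j}) - F(x^{k_j+1}) = \frac{1}{\tilde{\tau}_{k_j}} \alpha_{k_j}^2$. In this case,

$$F(x^{k_j}) - F(x^{k_j+1}) = \frac{1}{\tilde{\tau}_{k_j}} \alpha_{k_j}^2 \ge \frac{1}{\tau_{\max}} \alpha_{k_j}^2 \ge \frac{1}{\tau_{\max}} \gamma^2 \delta_i^2.$$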

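Setting $\mu = \min\{\varepsilon_i^2 \tau_{\min},\ \delta_i^2/\tau_{\max},\ \gamma^2 \delta_i^2/\tau_{\max}\}$, it follows that whenever $x^{k_j} \in B_i$ and $d^{k_j+1} \in B_{\delta_i}(\tilde{d}^i)$, then

$$F(x^{k_j}) - F(x^{k_j+1}) \ge \mu.$$

Choosing $K \in \mathbb{N}$ such that $K\mu > M - m$, we know that this event only has to occur K times for $(x^{k_j})_{j \in \mathbb{N}}$ to leave $B_i$. In other words, almost surely, there is no subsequence $(x^{k_j})_{j \in \mathbb{N}} \subset B_i$.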

This concludes the proof.
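The counting step at the end of the proof can be illustrated numerically. The sketch below evaluates the decrease bound µ and the smallest K with Kµ > M − m. The function names and all numerical values are hypothetical and chosen only for illustration; they do not come from the experiments in this chapter.

```python
import math


def decrease_bound(eps_i, delta_i, gamma, tau_min, tau_max):
    """mu = min{eps_i^2 tau_min, delta_i^2 / tau_max, gamma^2 delta_i^2 / tau_max}."""
    return min(eps_i ** 2 * tau_min,
               delta_i ** 2 / tau_max,
               gamma ** 2 * delta_i ** 2 / tau_max)


def max_visits(mu, F_min, F_max):
    """Smallest integer K such that K * mu > F_max - F_min."""
    return math.floor((F_max - F_min) / mu) + 1


# Hypothetical values: eps_i = 0.5, delta_i = 0.25, gamma = 0.5, tau_min = tau_max = 1.
mu = decrease_bound(0.5, 0.25, 0.5, 1.0, 1.0)
# Hypothetical range of F on B_i: m = 0, M = 1.
K = max_visits(mu, 0.0, 1.0)
```

With these values the third term, γ²δ_i²/τ_max, is the smallest, so the progress parameter γ determines the guaranteed decrease per favourable step.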

Deterministic case

We now state the deterministic case, in which $(d^k)_{k \in \mathbb{N}}$ is required to be cyclically dense. Its proof is simply that of Theorem 4.10, but referring to the proof of Theorem 5.19 instead of Theorem 4.6.

Theorem 5.20. Let $F : \Omega \to \mathbb{R}$ be locally Lipschitz continuous, coercive, and bounded below, where $\Omega \subset \mathbb{R}^n$ is epi-Lipschitzian. Let $(x^k)_{k \in \mathbb{N}}$ solve Algorithm 4 for $\varepsilon_x = \varepsilon_F = 0$, where $(d^k)_{k \in \mathbb{N}}$ are cyclically dense. Then the limit set S is in the set of stationary points.

5.4 Numerical experiments

We consider some simple numerical examples on $\mathbb{R}^2$. Algorithm 4 has been implemented in MATLAB, using a simple bisection search to solve the scalar equation. For the algorithmic parameters, we have chosen $\tau_k = 1$ for all $k \in \mathbb{N}$, $d^k$ drawn independently from the uniform distribution on $S^{n-1}$, $\gamma = 0.5$, $\varepsilon_x = 10^{-5}$, and $\varepsilon_F = 10^{-5}$. The algorithm is set to stop if the objective value has decreased by less than $\varepsilon_F$ over the last 30 iterates.

Fig. 5.1 Numerical results for the optimisation problem in (5.9), with iterates going from red to black.
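The two ingredients of this setup, drawing directions uniformly from the sphere and solving the scalar equation by bisection, can be sketched as follows. This is a minimal Python illustration, not the MATLAB implementation used for the experiments and not Algorithm 4 itself. It assumes the scalar equation takes the form F(x) − F(x − αd) = α²/τ, as it appears in the proof of Theorem 5.19, and it omits the constraint handling and the progress parameter γ. The function names, the doubling strategy used to bracket a root, and the fallback α = 0 are assumptions made for this sketch.

```python
import math
import random


def sample_direction(n, rng):
    """Draw a direction uniformly from the unit sphere S^{n-1}
    by normalising a standard Gaussian vector."""
    while True:
        v = [rng.gauss(0.0, 1.0) for _ in range(n)]
        norm = math.sqrt(sum(c * c for c in v))
        if norm > 0.0:
            return [c / norm for c in v]


def itoh_abe_step(F, x, d, tau=1.0, probe=1e-6, tol=1e-12):
    """Find alpha != 0 with F(x) - F(x - alpha * d) = alpha**2 / tau by bisection.

    Returns 0.0 if no sign change is bracketed along d or -d
    (hypothetical fallback). Constraints are not handled here.
    """
    Fx = F(x)

    def g(alpha):
        y = [xi - alpha * di for xi, di in zip(x, d)]
        return F(y) - Fx + alpha * alpha / tau

    for sign in (1.0, -1.0):
        lo = sign * probe
        if g(lo) >= 0.0:
            continue  # no decrease detected for small steps of this sign
        hi = lo
        for _ in range(60):  # double the step until g changes sign
            hi *= 2.0
            if g(hi) > 0.0:
                break
        else:
            continue  # no bracket found for this sign
        for _ in range(200):  # bisection, keeping g(lo) < 0 <= g(hi)
            if abs(hi - lo) <= tol:
                break
            mid = 0.5 * (lo + hi)
            if g(mid) < 0.0:
                lo = mid
            else:
                hi = mid
        return 0.5 * (lo + hi)
    return 0.0


def sphere(x):
    """Smooth test function, used only to check the solver."""
    return sum(c * c for c in x)


alpha = itoh_abe_step(sphere, [1.0, 0.0], [1.0, 0.0], tau=1.0)
```

For the quadratic test function the scalar equation reduces to 2α(α − 1) = 0, so the nonzero root is α = 1 and the step moves from (1, 0) to the minimiser at the origin.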

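We first consider the optimisation problem (5.1) with $F : \mathbb{R}^2 \to \mathbb{R}$ and Ω given by

$$
\begin{aligned}
F(x) &:= \max\{|\cos(x_1 + x_2) + \sin(3x_2)|,\ |\sin(x_1 + 1)|\}, \\
\Omega &:= \{x \in \mathbb{R}^2 : (x_1 - 4)^2 + (x_2 - 2.7)^2 \le 4\}.
\end{aligned}
\tag{5.9}
$$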

The results are plotted in Figure 5.1, with increasing iterates plotted with darker colours, and the infeasible region plotted in yellow.
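For readers who wish to reproduce this test problem, the objective and the feasibility check in (5.9) can be written down directly. The following is a Python sketch rather than the original MATLAB code, and the function names `objective_59` and `feasible_59` are hypothetical. The feasibility check only answers membership queries point by point, which is all a black-box method requires.

```python
import math


def objective_59(x):
    """F(x) = max{|cos(x1 + x2) + sin(3 x2)|, |sin(x1 + 1)|} from (5.9)."""
    x1, x2 = x
    return max(abs(math.cos(x1 + x2) + math.sin(3.0 * x2)),
               abs(math.sin(x1 + 1.0)))


def feasible_59(x):
    """Membership test for the disc of radius 2 centred at (4, 2.7) from (5.9)."""
    x1, x2 = x
    return (x1 - 4.0) ** 2 + (x2 - 2.7) ** 2 <= 4.0
```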

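Next, we consider the optimisation problem with nonsmooth, nonconvex constraints, given by

$$
\begin{aligned}
F(x) &:= \max\{|\cos(x_1 + x_2 + 3) + \sin(3x_2 - 0.5)|,\ |\sin(x_1 + 4.5)|\}, \\
\Omega &:= \{-2x_1 + 1.2x_2 \le -4\} \cup \{x_1 + x_2 \ge 6\} \cup \{x_1 + 2x_2 \le 11.5\} \ldots \\
&\qquad \cap \{-2x_1 + x_2 \ge -7\} \cap \{-x_1 + x_2 \le -1\}.
\end{aligned}
\tag{5.10}
$$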

The results are plotted in Figure 5.2.

Fig. 5.2 Numerical results for the optimisation problem in (5.10), with iterates going from red to black.

5.5 Conclusion and outlook

In this chapter, we have extended the Itoh–Abe methods to constrained optimisation problems, and proven convergence guarantees to Clarke stationary points in this setting. This extension and analysis is important because bilevel problems and simulation-based parameter optimisation problems, which constitute a central motivation for the study of black-box optimisation methods, often feature constraints in the domain. Furthermore, these constraints might be implicitly defined, so that we have no information about the feasible set beyond verifying feasibility point by point. It is therefore important to treat this in the nonsmooth, nonconvex, derivative-free optimisation framework. We have applied these methods to some simple numerical examples. A wider numerical investigation of this approach is left for future work.

Chapter 6

Bregman discrete gradient methods for sparse optimisation

6.1 Introduction

This chapter is based on the article [20] published in the Journal of Mathematical Imaging and Vision, and is joint work with Martin Benning and Carola-Bibiane Schönlieb.

In Chapters 3–5, we studied the discrete gradient method applied to gradient flow in various optimisation settings. In this chapter, we study these methods applied to a different dissipative flow, namely the inverse scale space (ISS) flow.

We consider the constrained optimisation problem

$$\min_{x \in \Omega} F(x), \tag{6.1}$$

for an objective function $F : \mathbb{R}^n \to \mathbb{R}$ and constraint $\Omega \subset \mathbb{R}^n$. The function F may be nonconvex and nonsmooth, as outlined in Assumption 6.1. In this chapter, we propose and study discrete gradient methods applied to the ISS flow.

The ISS flow is a differential system which generalises gradient flows by replacing the Euclidean distance by a Bregman distance, defined via a convex Bregman distance generating function $J : \mathbb{R}^n \to \mathbb{R}$. The ISS flow is given by

$$\dot{p}(t) = -\nabla F(x(t)), \qquad p(t) \in \partial J(x(t)). \tag{6.2}$$

The term inverse scale space flow goes back to Scherzer & Groetsch [201]. It is typically derived as the continuous-time limit of Bregman iterations (2.3). Like the gradient flow, the ISS flow is a dissipative system, and its dissipative structure is determined by the function J.
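As a quick check of how (6.2) generalises the gradient flow, consider the quadratic generating function. The short derivation below is a standard special case, stated under the assumptions that F is differentiable and that the Bregman distance is defined in the usual way as $D_J^p(y, x) = J(y) - J(x) - \langle p, y - x \rangle$ for $p \in \partial J(x)$; neither the notation $D_J^p$ nor this particular choice of J is taken from the text above.

```latex
% Assumption: F is differentiable and J(x) = \tfrac{1}{2}\|x\|_2^2.
\[
  \partial J(x) = \{x\}
  \quad\Longrightarrow\quad p(t) = x(t)
  \quad\Longrightarrow\quad \dot{x}(t) = -\nabla F(x(t)),
\]
% so (6.2) reduces to the gradient flow. The associated Bregman distance is
\[
  D_J^{p}(y, x)
  = \tfrac{1}{2}\|y\|_2^2 - \tfrac{1}{2}\|x\|_2^2 - \langle x, y - x \rangle
  = \tfrac{1}{2}\|y - x\|_2^2 ,
\]
% i.e. half the squared Euclidean distance.
```

Other convex choices of J change the geometry in which the flow dissipates F, which is the sense in which J determines the dissipative structure.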

This allows one to solve (6.1) while incorporating a priori information into the optimisation scheme, with the benefits of converging to superior solutions, and doing so faster. The drawback of these methods is that the updates are in general implicit. Nevertheless, for many simple variational problems, the updates turn out to be explicit.

In this chapter, we study the Itoh–Abe discrete gradient method applied to the ISS flow. We prove that the method is well-defined and converges to a set of stationary points for nonsmooth, nonconvex functions. Furthermore, building on the paper by Miyatake et al. [153], which establishes the equivalence between discrete gradient methods for linear systems and successive over-relaxation (SOR) methods, we point out equivalences between various approaches to least squares problems.

Bregman iterations, and related methods, are closely tied to inverse problems and regularisation methods, particularly in signal processing. We consider numerical examples in this setting as well.
