
The following part is mainly transferred literally from Beume et al. (2010)∗. Convergence properties of EMOA are not yet well understood. More recently, theory has concentrated on the convergence or runtime of simple EMOA on special discrete problems, considering whether and how quickly the Pareto set is reached. For the case of a continuous search space $\mathbb{R}^n$, only a few results exist for specialized algorithms, the first obtained by Rudolph (1998). He showed that a multi-objective (1+1)-EA that accepts incomparable points with probability 1/2 converges with probability 1 to the Pareto set if the step size is chosen proportional to the distance to the Pareto set, while two other step size concepts fail. Hanne (1999) considered stochastic convergence of EMOA with different selection schemes, the possibility of temporary fitness deterioration, and problems with unattainable solutions. A recent subject of interest has been whether a certain distribution on the Pareto front can be obtained that is optimal regarding specified preferences.

Despite these advances, the convergence rate in continuous space remains a neglected topic. Teytaud (2007) shows that the convergence rate scales badly with an increasing number of objectives, entailing that any comparison-based EMOA performs hardly better than random search for a large number of objectives. A general lower bound for the convergence time is also given.

Convergence

The following part is mainly transferred literally from Beume et al. (2011)∗.

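Definition 3.2 Let $X$ be a random variable and $(X_t)$ a sequence of random variables defined on a probability space $(\Omega, \mathcal{A}, P)$. Then $(X_t)$ is said to

(a) converge completely to $X$, if for any $\varepsilon > 0$

$$\lim_{t\to\infty} \sum_{i=1}^{t} \Pr(|X_i - X| > \varepsilon) < \infty;$$

(b) converge almost surely or with probability 1 to $X$, if

$$\Pr\bigl(\lim_{t\to\infty} |X_t - X| = 0\bigr) = 1;$$

(c) converge in probability to $X$, if for any $\varepsilon > 0$

$$\lim_{t\to\infty} \Pr(|X_t - X| > \varepsilon) = 0;$$

(d) converge in mean to $X$, if

$$\lim_{t\to\infty} E(|X_t - X|) = 0.$$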
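The modes of convergence in Definition 3.2 are not independent of each other. As a reading aid (these standard implications are not stated in the source text), they are related as follows:

```latex
\[
\text{complete} \;\Longrightarrow\; \text{almost sure} \;\Longrightarrow\; \text{in probability},
\qquad
\text{in mean} \;\Longrightarrow\; \text{in probability}.
\]
% The first implication follows from the Borel--Cantelli lemma,
% the last one from Markov's inequality:
\[
\Pr(|X_t - X| > \varepsilon) \;\le\; \frac{E(|X_t - X|)}{\varepsilon}
\qquad \text{for every } \varepsilon > 0 .
\]
```

The converse implications do not hold in general.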

The velocity of approaching a limit is expressed by the convergence rate.

Definition 3.3 Let $(Z_k : k \ge 0)$ be a non-negative random sequence. The sequence is said to converge geometrically fast in mean (in probability, w.p. 1) to zero if there exists a constant $q > 1$ such that the sequence $(q^k Z_k : k \ge 0)$ converges in mean (in probability, w.p. 1) to zero. Let $q^* > 1$ be the supremum of all constants $q > 1$ such that geometrically fast convergence is still guaranteed. Then $c = 1/q^*$ is called the convergence rate. A sequence with geometrically fast convergence is synonymously said to have a linear convergence rate.

Let $\rho(\cdot)$ denote a function that measures the performance of an EA's population $X_k$, and $\rho^*$ the target value. If the sequence $(Z_k)_{k \ge 0}$ defined by $Z_k = |\rho(X_k) - \rho^*|$ converges (in any mode mentioned above) to zero with a certain convergence rate, then the EA approaches the target performance value with this rate.
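The role of $q$ in Definition 3.3 can be illustrated with a deterministic toy sequence. The sketch below assumes a hypothetical error sequence $Z_k = 0.5^k$, which is not taken from the source; for it, $q^k Z_k = (q/2)^k$ tends to zero exactly when $q < 2$, so $q^* = 2$ and the convergence rate is $c = 1/2$. The function names are illustrative only.

```python
def scaled_sequence(z, q):
    """Return the sequence (q**k * z[k]) used in Definition 3.3."""
    return [q ** k * zk for k, zk in enumerate(z)]


def estimated_rate(z):
    """Rough estimate of c: geometric mean of the successive ratios z[k+1]/z[k]."""
    return (z[-1] / z[0]) ** (1.0 / (len(z) - 1))


# Hypothetical error sequence Z_k = 0.5**k (an assumption for illustration).
z = [0.5 ** k for k in range(200)]

below = scaled_sequence(z, 1.9)  # q < q* = 2: the scaled sequence still shrinks
above = scaled_sequence(z, 2.1)  # q > q* = 2: the scaled sequence grows
rate = estimated_rate(z)         # should be close to c = 1/q* = 0.5
```

For a real EA run, $Z_k$ is random, so such an estimate only describes one realization and says nothing by itself about the mode of convergence.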

For example, let $\rho(X_k)$ be the best objective function value of the population at generation $k \ge 0$ of a single-criterion EA and $\rho^*$ be the global minimum of the objective function. If $Z_k$ converges to zero, then the EA converges to the global minimum. Similarly, let $\rho(X_k)$ be the dominated hypervolume of population $X_k$ and $\rho^*$ the maximal dominated hypervolume in the multi-objective scenario; then the population converges to the maximum dominated hypervolume if $Z_k \to 0$ as $k \to \infty$. For the convergence analysis in Sections 3.2.2 and 3.2.3 we consider the convergence of the population towards the Pareto front. There, $\rho(X_k)$ denotes the distance to a certain point on the Pareto front.

Note that we analyze algorithms w.r.t. their black-box complexity (cf. Section 1.4), i.e., we consider the number of function evaluations and express convergence rates in this measure.

Convexity

Definition 3.4 A set $S \subseteq \mathbb{R}^n$ is said to be convex if $\xi x + (1-\xi) y \in S$ for all $x, y \in S$ and $\xi \in [0,1]$. A function $f : S \to \mathbb{R}$ is termed

(a) convex if $f(\xi x + (1-\xi) y) \le \xi f(x) + (1-\xi) f(y)$,

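(b) strictly convex if the inequality in (a) is strict for all $x \ne y$ and $\xi \in (0,1)$,

(c) strongly convex if there exists a constant $K > 0$ such that

$$f(\xi x + (1-\xi) y) \le \xi f(x) + (1-\xi) f(y) - \frac{K}{2}\,\xi(1-\xi)\,\|x-y\|^2,$$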

(d) $(K, Q)$-strongly convex if it is strongly convex and

$$\frac{K}{2}\,\xi(1-\xi)\,\|x-y\|^2 \;\le\; \xi f(x) + (1-\xi) f(y) - f(\xi x + (1-\xi) y) \;\le\; \frac{L}{2}\,\xi(1-\xi)\,\|x-y\|^2 \qquad (3.10)$$

with $K, L \in \mathbb{R}_+$, $0 < K \le L < \infty$ and $Q = L/K$.

Definition 3.4(a) says that a function is convex if its epigraph (the set of points lying on or above its graph) is convex. For convex functions any local minimum is a global one, i.e., all minima have equal function values. Strictly convex functions are convex with at most one minimizer. Strongly convex functions are a subclass fulfilling a tighter bound of the inequality, whereas for $(K, Q)$-strongly convex functions the relation of the terms is bounded from two sides. The inequalities become more precise with increasing values of the parameters $K$ and $L$.

Finally, notice that $f(\cdot)$ is termed concave if $-f(\cdot)$ is convex.
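The two-sided bound (3.10) can be checked numerically for a simple quadratic. The sketch below assumes $f(x) = \frac{1}{2} x^\top A x$ with a diagonal matrix $A$ whose diagonal entries are hypothetical values chosen for illustration. For such a function the middle term of (3.10) equals $\frac{1}{2}\xi(1-\xi)(x-y)^\top A (x-y)$, so the bound holds with $K$ and $L$ equal to the smallest and largest eigenvalue of $A$.

```python
import random


def quad(diag, x):
    """f(x) = 1/2 * x^T A x for a diagonal matrix A given by its diagonal."""
    return 0.5 * sum(d * xi * xi for d, xi in zip(diag, x))


def convexity_gap(diag, x, y, xi):
    """xi*f(x) + (1-xi)*f(y) - f(xi*x + (1-xi)*y), the middle term of (3.10)."""
    z = [xi * a + (1.0 - xi) * b for a, b in zip(x, y)]
    return xi * quad(diag, x) + (1.0 - xi) * quad(diag, y) - quad(diag, z)


def bounds_hold(diag, x, y, xi, tol=1e-9):
    """Check (3.10) with K = smallest and L = largest diagonal entry of A."""
    K, L = min(diag), max(diag)
    dist2 = sum((a - b) ** 2 for a, b in zip(x, y))
    gap = convexity_gap(diag, x, y, xi)
    lower = 0.5 * K * xi * (1.0 - xi) * dist2
    upper = 0.5 * L * xi * (1.0 - xi) * dist2
    return lower - tol <= gap <= upper + tol


rng = random.Random(0)
diag = [1.0, 2.0, 5.0]  # hypothetical eigenvalues of A (assumption)
all_ok = all(
    bounds_hold(
        diag,
        [rng.uniform(-3.0, 3.0) for _ in diag],
        [rng.uniform(-3.0, 3.0) for _ in diag],
        rng.random(),
    )
    for _ in range(1000)
)
Q = max(diag) / min(diag)  # Q = L/K for this example
```

A random test of this kind only supports the inequality on the sampled points; the identity stated in the lead-in is what establishes it in general.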

Definition 3.5 A symmetric square matrix $A$ with eigenvalues $\{\nu_1, \ldots, \nu_n\}$ is termed

1. positive semidefinite iff $\nu_i \ge 0$, $\forall i \in \{1, \ldots, n\}$

2. positive definite iff $\nu_i > 0$, $\forall i \in \{1, \ldots, n\}$

3. negative semidefinite iff $\nu_i \le 0$, $\forall i \in \{1, \ldots, n\}$

4. negative definite iff $\nu_i < 0$, $\forall i \in \{1, \ldots, n\}$

5. indefinite iff $\exists\, \nu_i < 0$ and $\exists\, \nu_j > 0$, with $i, j \in \{1, \ldots, n\}$.
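The five cases of Definition 3.5 translate directly into sign tests on the eigenvalues. The following minimal sketch assumes the eigenvalues are already known and exact; the function name is illustrative and not from the source. It returns every label that applies, since the cases overlap (a positive definite matrix is also positive semidefinite, and the zero matrix is both positive and negative semidefinite).

```python
def definiteness_labels(eigenvalues):
    """Return all labels of Definition 3.5 that apply to the given eigenvalues."""
    ev = list(eigenvalues)
    labels = []
    if all(v >= 0 for v in ev):
        labels.append("positive semidefinite")
    if all(v > 0 for v in ev):
        labels.append("positive definite")
    if all(v <= 0 for v in ev):
        labels.append("negative semidefinite")
    if all(v < 0 for v in ev):
        labels.append("negative definite")
    if any(v < 0 for v in ev) and any(v > 0 for v in ev):
        labels.append("indefinite")
    return labels


example_pd = definiteness_labels([1.0, 1.0, 3.0])
example_zero = definiteness_labels([0.0, 0.0])
example_indef = definiteness_labels([-1.0, 2.0])
```

With eigenvalues computed numerically, exact comparisons against zero are fragile, so a tolerance would be needed in practice.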

Theorem 3.6 (cf. e.g. Hiriart-Urruty and Lemaréchal (2001, Th. 4.3.1)) Let $f : \mathbb{R}^n \to \mathbb{R}$ be twice continuously differentiable and $Q$ denote its Hessian matrix. Then it holds:

1. $f$ is convex $\iff$ $Q$ is positive semidefinite

2. $f$ is strictly convex $\Longleftarrow$ $Q$ is positive definite

3. $f$ is concave $\iff$ $Q$ is negative semidefinite

4. $f$ is strictly concave $\Longleftarrow$ $Q$ is negative definite.

Lemma 3.7 Let $f_1(x) = a^\top x + a_0$ and $f_2(x) = b^\top x + b_0$ be linear functions with $a, b, x \in \mathbb{R}^n$, $a_0, b_0 \in \mathbb{R}$, and $n \ge 2$. The dominated hypervolume for a fixed reference point $r \in \mathbb{R}^2$ is a concave function if the matrix $a\,b^\top$ is negative semidefinite (Beume et al. (2011)∗).

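Proof. Notice that the dominated hypervolume

$$\begin{aligned}
H(\{f(x)\}, r) &= [r_1 - f_1(x)]\,[r_2 - f_2(x)] \\
&= [r_1 - (a^\top x + a_0)]\,[r_2 - (b^\top x + b_0)] \\
&= [(r_1 - a_0) - a^\top x]\,[(r_2 - b_0) - b^\top x] \\
&= (r_1 - a_0)(r_2 - b_0) - [(r_1 - a_0)\,b + (r_2 - b_0)\,a]^\top x + a^\top x \cdot b^\top x \\
&= (r_1 - a_0)(r_2 - b_0) - [(r_1 - a_0)\,b + (r_2 - b_0)\,a]^\top x + x^\top (a\,b^\top)\,x
\end{aligned}$$

is a quadratic form which is concave iff $a\,b^\top$ is negative semidefinite.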

Example: $a\,b^\top$ is negative semidefinite if $b = -a$ since $a\,a^\top$ is positive semidefinite. Lemma 3.7 is used in Section 3.2.4.
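The case $b = -a$ can be checked numerically. The sketch below uses hypothetical values for $a$, $a_0$, $b_0$ and $r$ (assumptions, not from the source) and evaluates the product formula from the proof directly, without checking that $f(x)$ actually dominates $r$. It tests that $x^\top (a\,b^\top) x = (a^\top x)(b^\top x) \le 0$ on random points and that the hypervolume satisfies midpoint concavity on random pairs.

```python
import random


def dot(u, v):
    """Standard inner product of two equally long vectors."""
    return sum(ui * vi for ui, vi in zip(u, v))


def hypervolume_product(a, a0, b, b0, r, x):
    """[r1 - f1(x)] * [r2 - f2(x)] for f1(x) = a^T x + a0 and f2(x) = b^T x + b0."""
    return (r[0] - (dot(a, x) + a0)) * (r[1] - (dot(b, x) + b0))


rng = random.Random(1)
a = [1.0, -2.0, 0.5]           # hypothetical coefficient vector (assumption)
b = [-ai for ai in a]          # the example case b = -a
a0, b0, r = 0.0, 0.0, (10.0, 10.0)

samples = [[rng.uniform(-2.0, 2.0) for _ in a] for _ in range(500)]

# x^T (a b^T) x = (a^T x) * (b^T x) = -(a^T x)^2 <= 0 for b = -a
quad_nonpositive = all(dot(a, x) * dot(b, x) <= 0.0 for x in samples)

# Midpoint concavity: H((x+y)/2) >= (H(x) + H(y)) / 2, up to rounding
midpoint_concave = all(
    hypervolume_product(a, a0, b, b0, r, [(p + q) / 2.0 for p, q in zip(x, y)])
    >= 0.5 * (hypervolume_product(a, a0, b, b0, r, x)
              + hypervolume_product(a, a0, b, b0, r, y)) - 1e-9
    for x, y in zip(samples, samples[1:])
)
```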

Definition 3.8 A symmetric positive semidefinite matrix $A$ has bounded bandwidth $\kappa \ge 1$ iff for all eigenvalues $\nu_1 \le \nu_2 \le \ldots \le \nu_n < \infty$ holds: $\exists\, \kappa \ge 1 : \forall i = 1, \ldots, n : \nu_i \in [\nu_1, \kappa \cdot \nu_1]$, with $\kappa$ being a constant.

A function $f(x) = \frac{1}{2} x^\top A x + b^\top x + c$ is said to have bounded bandwidth iff its matrix $A$ has bounded bandwidth.

Note that the bounded bandwidth condition cannot be fulfilled if some but not all eigenvalues are zero, so it can only hold for the zero matrix or for positive definite matrices. Consider a small counterexample where the eigenvalues cannot be bounded by a constant. In the diagonal matrix $A$ the diagonal entries are the eigenvalues, which are all positive, so the matrix is positive definite. Yet there is no constant $\kappa$ such that the interval $[1, \kappa \cdot 1]$ contains the largest eigenvalue $n$:

$$A = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & n \end{pmatrix}, \qquad \{1, 1, n\} \subseteq [1, \kappa \cdot 1] \;\Rightarrow\; \kappa \ge n,$$

so $\kappa$ would have to grow with $n$ and is not a constant.
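The counterexample can be made concrete by computing the smallest admissible $\kappa$ for a given spectrum. The helper below is a hypothetical illustration (its name and its return conventions are assumptions): for a positive definite matrix the smallest $\kappa$ is $\nu_n / \nu_1$, for the zero matrix $\kappa = 1$ suffices, and no finite $\kappa$ exists if some but not all eigenvalues are zero.

```python
def smallest_kappa(eigenvalues):
    """Smallest kappa with all eigenvalues in [nu_1, kappa * nu_1].

    Returns None if no finite kappa exists. Assumes a positive
    semidefinite matrix, i.e. non-negative eigenvalues.
    """
    ev = sorted(eigenvalues)
    if ev[0] < 0:
        raise ValueError("eigenvalues of a positive semidefinite matrix must be >= 0")
    if ev[-1] == 0:
        return 1.0   # zero matrix: every eigenvalue equals nu_1 = 0
    if ev[0] == 0:
        return None  # some but not all eigenvalues are zero
    return ev[-1] / ev[0]


# Spectrum {1, 1, n} of the diagonal matrix above: kappa grows linearly with n.
kappas = [smallest_kappa([1.0, 1.0, float(n)]) for n in (10, 100, 1000)]
```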

Lemma 3.9 Let $f(x) = \frac{1}{2} x^\top A x + b^\top x + c$ be a quadratic convex function with bounded bandwidth, where $A$ is not the zero matrix. Then its matrix $A$ is positive definite and has bounded bandwidth.

Proof. As $f$ is convex, it follows that $A$ is positive semidefinite, so $\forall i = 1, \ldots, n : \nu_i \ge 0$. The condition $\exists\, \kappa \ge 1 : \forall i = 1, \ldots, n : \nu_i \in [\nu_1, \kappa \cdot \nu_1]$ of the bounded bandwidth is only fulfilled if $A$ is the zero matrix (which we exclude from consideration) or if $\forall i = 1, \ldots, n : \nu_i > 0$ holds, and thus $A$ is even positive definite, which implies

Algorithmic Setup

We consider the SMS-EMOA not only with the standard $(\mu + 1)$-selection but also with other schemes of the $(\mu \stackrel{+}{,} \lambda)$ framework. A new selection scheme with a tournament selection based on pairwise comparisons is explained in detail in Section 3.2.3. Several mutation operators are considered in order to calculate convergence rates or, respectively, to transfer known results. We do not provide convergence rates for recombination.

The reference point is mainly chosen following the concept of the adaptive reference point. Recall that points that are the worst ones in the population regarding an objective function have a distance to the adaptive reference point of exactly 1 w.r.t. that worst objective (cf. Fig. 3.2). This fact is exploited to obtain the convergence result described in Section 3.2.2.1.

From Lemma 2.5 we know that in case of rating two points by their absolute hypervolume, the resulting order is the same as if the points were rated by their hypervolume contribution. We refer to this lemma to argue that both measures behave equally in case of selecting among two points.
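The ordering argument can be made tangible for two points in a bi-objective minimization problem. The sketch below is an illustration under assumptions (the points, the reference point and all function names are hypothetical): since the contribution of $p$ is $H(\{p, q\}) - H(\{q\})$ and that of $q$ is $H(\{p, q\}) - H(\{p\})$, their difference equals $H(\{p\}) - H(\{q\})$, so both measures rank the two points identically.

```python
def hv_single(p, r):
    """Hypervolume dominated by one point p w.r.t. reference point r (2-D, minimization)."""
    return max(r[0] - p[0], 0.0) * max(r[1] - p[1], 0.0)


def hv_pair(p, q, r):
    """Hypervolume of {p, q}: area of the union of two boxes by inclusion-exclusion."""
    worst = (max(p[0], q[0]), max(p[1], q[1]))  # corner of the overlap box
    return hv_single(p, r) + hv_single(q, r) - hv_single(worst, r)


def contribution(p, q, r):
    """Hypervolume contribution of p in the two-point population {p, q}."""
    return hv_pair(p, q, r) - hv_single(q, r)


p, q, r = (1.0, 4.0), (3.0, 2.0), (5.0, 5.0)  # hypothetical example values

absolute_order = hv_single(p, r) < hv_single(q, r)
contribution_order = contribution(p, q, r) < contribution(q, p, r)
```

This covers only a population of exactly two points, which is the case the text refers to; with three or more points the two rankings can differ.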

Algorithmic Equivalence

The following part is mainly transferred literally from Beume et al. (2011)∗. We exploit that convergence results for one EA [. . . ] can be transferred to another in case the algorithms behave similarly. Therefore, we introduce the following definition of algorithmic equivalence:

Definition 3.10 Let $(X_t)_{t \ge 0}$ and $(Y_t)_{t \ge 0}$ be two stochastic sequences of states generated by two evolutionary algorithms $A_X$ and $A_Y$. The EAs $A_X$ and $A_Y$ are called algorithmically equivalent if $\forall t \ge 0 : X_t \overset{d}{=} Y_t$ holds for their associated state sequences, i.e., $X_t$ and $Y_t$ have the same distribution for all $t \ge 0$.

In particular, two algorithms are algorithmically equivalent if both use the same probability distribution for probabilistic decisions, and their deterministic decisions are equal. We use the term in this way by assuming equal probabilistic operators (initialization, variation) and showing that the deterministic selection operators produce the same output in case of the same input for certain classes of problems. We do not consider the computational resources of the EA operators but only the input-output behavior, i.e., the state of the EA's population.
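This way of arguing can be sketched with a toy (1+1)-EA on a one-dimensional search space. Everything in the sketch is an assumption made for illustration (objective, step size, seed, function names); it is not the SMS-EMOA. Both runs share initialization and variation and differ only in how the deterministic selection rule is written: one minimizes $|x|$, the other maximizes the quality $-2|x|$. As the two rules always make the same decision, the runs produce identical state sequences when fed the same random numbers, which in particular implies equal distributions.

```python
import random


def run_one_plus_one(accept, seed, steps=50, sigma=0.3):
    """Toy (1+1)-EA; `accept(y, x)` is the deterministic selection rule."""
    rng = random.Random(seed)
    x = rng.uniform(-5.0, 5.0)         # probabilistic initialization
    states = [x]
    for _ in range(steps):
        y = x + rng.gauss(0.0, sigma)  # probabilistic variation
        if accept(y, x):               # deterministic selection
            x = y
        states.append(x)
    return states


def select_min_abs(y, x):
    """Accept the offspring if its objective value |y| is not worse."""
    return abs(y) <= abs(x)


def select_max_quality(y, x):
    """Accept the offspring if its quality -2|y| is not worse (same order)."""
    return -2.0 * abs(y) >= -2.0 * abs(x)


states_x = run_one_plus_one(select_min_abs, seed=42)
states_y = run_one_plus_one(select_max_quality, seed=42)
```

Identical realizations under a shared seed are a stronger property than Definition 3.10 requires; the definition only asks for equal distributions of the states.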

3.2.2 SMS-EMOA with Adaptive Reference Point
