
Steps 5 and 6 of Algorithm 3.1 were stated generically in order to emphasize the possibility of the user supplying any appropriate partitioning strategy. In this section, the strategies implemented in the algorithm and used on the examples will be discussed. First, the strategy for partitioning the X space is discussed.

X Partitioning Strategies

The intuitive picture of partitioning the X space is to take a box that may include multiple solution branches and generate new boxes that are likely to contain either no solution branches or a single solution branch. In other words, the objective of partitioning X is to separate solution branches into their own boxes. This is essentially the standard bisection strategy, with one subtlety: the X interval is cut such that the newly generated intervals either contain solution branches on all of P or contain no solution branches, excluding the partial enclosure scenario altogether. Since bisecting X at the midpoint of the $i$th component will likely produce a partial enclosure scenario, this strategy cannot be employed blindly, as is common in classical generalized bisection algorithms. The simplest way to avoid making a cut that generates partial enclosures is to search for a cut position that satisfies Theorem 3.4.1. The strategy that was implemented in this thesis determines a component with the maximum width, $i \in \arg\max_j w(X_j)$, and searches for a $\nu \in (0, 1)$, if one exists, such that

$$\tilde{x}_i := \nu \left( x_i^L + \frac{\mathrm{rad}(X_i)}{\mu} \right) + (1 - \nu) \left( x_i^U - \frac{\mathrm{rad}(X_i)}{\mu} \right) \tag{3.12}$$

satisfies Theorem 3.4.1, with $\mu \in \{m \in \mathbb{R} : m > 1\}$. That is, $\tilde{x}_i$ is chosen such that it does not intersect a solution branch, as guaranteed by Theorem 3.4.1, thereby avoiding a partial enclosure scenario. The purpose of $\nu$ in (3.12) is to make it easy to evaluate many candidate values of $\tilde{x}_i$ that are convex combinations of the relevant interval bounds. The parameter $\mu$ is included to give the user freedom over the interval from which $\tilde{x}_i$ can be chosen. It determines how much of $X_i$ is considered for partitioning.

Figure 3-1: The effects of $\mu$ on the region considered for partitioning X in one dimension, shown for $\mu = 2$ and $\mu = 3$.

The value for $\mu$ can be chosen freely and tuned according to the system being bounded. As $\mu$ gets very large, the interval of candidate cut positions approaches $X_i$, and thus there is a larger potential to find cut positions very close to the original bounds of $X_i$, producing one very narrow interval and one wide interval nearly the width of $X_i$. Although this is conservative and poses no theoretical problems for the algorithm, it may be quite inefficient. Conversely, values of $\mu$ close to 1 can pose problems for the algorithm since they limit the candidate $\tilde{x}_i$ values to positions very close to the midpoint of $X_i$, where it will generally be very difficult to find proper partitions of $X_i$ that avoid partial enclosures. The effect of $\mu$ is illustrated in Figure 3-1 and the implementation of this strategy is discussed in Section 3.6.
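The candidate positions generated by (3.12) can be sketched as follows. This is only an illustration under stated assumptions: the function names, the grid of $\nu$ values, and the predicate `is_safe_cut` (a stand-in for the check of Theorem 3.4.1, which is not reproduced here) are hypothetical and are not taken from the implementation discussed in Section 3.6.

```python
from typing import Callable, List, Optional


def candidate_cuts(x_lo: float, x_hi: float, mu: float, nus: List[float]) -> List[float]:
    """Evaluate Eq. (3.12) for each nu in the open interval (0, 1)."""
    if mu <= 1.0:
        raise ValueError("mu must be greater than 1")
    rad = 0.5 * (x_hi - x_lo)
    left = x_lo + rad / mu    # x^L + rad(X)/mu
    right = x_hi - rad / mu   # x^U - rad(X)/mu
    return [nu * left + (1.0 - nu) * right for nu in nus if 0.0 < nu < 1.0]


def first_safe_cut(x_lo: float, x_hi: float, mu: float, nus: List[float],
                   is_safe_cut: Callable[[float], bool]) -> Optional[float]:
    """Return the first candidate accepted by the (assumed) safety predicate."""
    for cut in candidate_cuts(x_lo, x_hi, mu, nus):
        if is_safe_cut(cut):
            return cut
    return None  # no admissible cut position among the candidates
```

For X = [0, 4] and $\mu = 2$, the candidates lie in [1, 3]; with $\mu = 4$ they lie in [0.5, 3.5], which shows how a larger $\mu$ widens the region that is searched.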

If a partition for $X_i$ cannot be identified, again with $i \in \arg\max_j w(X_j)$, the strategy is to increase the value of $\mu$ and search for a partition in all components of X. If a partition for X still cannot be identified, it is determined that the P interval must be partitioned. Figure 3-2(a) illustrates a scenario in which an interval X encloses multiple solution branches but there does not exist a partition (candidate positions represented as dashed lines) that separates them while avoiding the partial enclosure scenario. Figure 3-2(b) illustrates how, after finding a position to partition P, there exist partitions of X such that the solution branches are separated and no partial enclosures are generated. The next section discusses a strategy for cutting P.

Figure 3-2: (a) A box $X \times P$ in which there does not exist a position to cut X (dashed lines) such that no partial enclosures are produced. (b) After cutting P, there exist positions to cut X that avoid partial enclosures.
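The fallback order described above (widest component first, then a larger $\mu$ over all components, then a request to partition P) can be sketched as follows. The default values of `mu` and `mu_retry`, the grid `nus`, and the predicate `is_safe_cut(i, cut)` are assumptions made for illustration; the predicate stands in for the check of Theorem 3.4.1.

```python
from typing import Callable, List, Optional, Sequence, Tuple

Interval = Tuple[float, float]


def search_component(box: List[Interval], i: int, mu: float, nus: Sequence[float],
                     is_safe_cut: Callable[[int, float], bool]) -> Optional[float]:
    """Scan the candidates of Eq. (3.12) in component i of the X box."""
    lo, hi = box[i]
    rad = 0.5 * (hi - lo)
    left, right = lo + rad / mu, hi - rad / mu
    for nu in nus:
        cut = nu * left + (1.0 - nu) * right
        if is_safe_cut(i, cut):
            return cut
    return None


def find_x_cut(box: List[Interval], is_safe_cut: Callable[[int, float], bool],
               mu: float = 2.0, mu_retry: float = 8.0,
               nus: Sequence[float] = (0.5, 0.25, 0.75)) -> Optional[Tuple[int, float]]:
    """Return (component, cut position), or None if P must be partitioned."""
    # First attempt: only the widest component, with the initial mu.
    widest = max(range(len(box)), key=lambda j: box[j][1] - box[j][0])
    cut = search_component(box, widest, mu, nus, is_safe_cut)
    if cut is not None:
        return widest, cut
    # Second attempt: larger mu, all components of X.
    for i in range(len(box)):
        cut = search_component(box, i, mu_retry, nus, is_safe_cut)
        if cut is not None:
            return i, cut
    return None  # signal to the caller that P should be cut instead
```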

P Partitioning Strategies

Due to the width of the parameter interval P , verifying existence and uniqueness of enclosed solution branches may be difficult or impossible, as illustrated in Fig. 3-2, even with the sharper existence and uniqueness test of Theorem 3.4.4. Therefore, a strategy for partitioning P must also be considered. An efficient strategy is to simply bisect down the middle of the widest dimension. The problem with this strategy is that, in general, each parameter may not contribute equally to overestimation and the inability to pass the existence and uniqueness test of Theorem 3.4.4. Therefore, it is desirable to have a method for determining which parameter dimension has the largest influence on the extremal eigenvalue(s) of A, as defined in Theorem 3.4.4. With such information, the parameter interval P can be partitioned in an intelligent manner with the objective of generating boxes that pass the existence and uniqueness test of Theorem 3.4.4.

By treating $\lambda_{\max}$ of Theorem 3.4.4 as a real-valued function of the bounds of X and P, pseudo-derivative information of $\lambda_{\max}$ with respect to the bounds of P, if it can be calculated, reveals the influence of the parameter interval on passing the existence and uniqueness test of Theorem 3.4.4. Furthermore, this information can be used to provide intelligent positions to cut P that will yield boxes that are more likely to pass the existence and uniqueness test of Theorem 3.4.4. As was illustrated in Figure 3-2(b), how one cuts P may have a large impact on how the X box is subsequently partitioned and on the ability to pass existence and uniqueness tests. The following results establish how this pseudo-derivative information can be calculated and how it is used in the partitioning strategy.

Definition 3.5.2 (Piecewise Continuously Differentiable [40]). A continuous function $g : D \subset \mathbb{R}^n \to \mathbb{R}^m$ is said to be piecewise continuously differentiable near $z \in D$ if there exists an open neighborhood $N \subset D$ of $z$ and a finite family of continuously differentiable functions $g_1, g_2, \ldots, g_k : N \to \mathbb{R}^m$, for $k \in \mathbb{N}$ ($k > 0$), such that $g(y) \in \{g_1(y), g_2(y), \ldots, g_k(y)\}$ for all $y \in N$.

Definition 3.5.3 ([127]). Let $D \subset \mathbb{I}\mathbb{R}^{n \times n}$ be open and let $F : D \to \mathbb{R}^{n \times n}$. $F$ will be called piecewise continuously differentiable on $D$ if for every piecewise continuously differentiable function $M : E \subset \mathbb{R}^{2nn} \to \mathbb{I}\mathbb{R}^{n \times n}$, the mapping $F(M(\cdot)) : E \to \mathbb{R}^{n \times n}$ is piecewise continuously differentiable on the open set $E_D = \{a \in E : M(a) \in D\}$.

Definition 3.5.4 ([127]). Let $D \subset \mathbb{R}^{n \times n}$ be open and let $f : D \to \mathbb{R}$. $f$ will be called piecewise continuously differentiable on $D$ if for every piecewise continuously differentiable function $M : E \subset \mathbb{R}^{nn} \to \mathbb{R}^{n \times n}$, the mapping $f(M(\cdot)) : E \to \mathbb{R}$ is piecewise continuously differentiable on the open set $E_D = \{a \in E : M(a) \in D\}$.

Lemma 3.5.5. Let $D_A \subset \mathbb{I}\mathbb{R}^{n \times n}$ be open such that for every $A \in D_A$, $m(A)$ is nonsingular. Define the real matrix-valued function $Z : D_A \to \mathbb{R}^{n \times n}$ as $Z(A) \equiv |m(A)^{-1}| \, \mathrm{rad}(A)$, $\forall A \in D_A$. Then $Z$ is piecewise continuously differentiable on $D_A$.

Proof. By definition, $\mathrm{rad}(\cdot)$ and $m(\cdot)$ are continuously differentiable on $D_A$. Since $m(A)$ is nonsingular $\forall A \in D_A$, $m(\cdot)^{-1}$ exists and can be expressed using only elementary arithmetic operations, and is therefore continuously differentiable on $D_A$. By definition, $|\cdot|$ is piecewise continuously differentiable on $D_A$. It immediately follows that $Z$ is piecewise continuously differentiable on $D_A$.

Assumption 3.5.6. Let $D \subset \mathbb{R}^{n \times n}$ be open. Let $Z$ and $D_A$ be as in Lemma 3.5.5 such that $Z(A) \in D$ for every $A \in D_A$. Let $\lambda_i(Z(A))$ be the $i$th real eigenvalue of $Z(A)$. It will be assumed that $\lambda_i : D \to \mathbb{R}$ is piecewise continuously differentiable on $D$.

Theorem 3.5.7. Suppose Assumption 3.5.6 holds. Define the function $\lambda_{\max} : \mathbb{R}^{n \times n} \to \mathbb{R}$ as $\lambda_{\max}(Z(A)) = \max_i \{|\lambda_i(Z(A))|\}$, the magnitude of the extremal eigenvalue(s) of $Z(A)$, $\forall A \in D_A$. Then $\lambda_{\max}$ is differentiable on $D$ at all points outside of a Lebesgue nullset.

Proof. By Assumption 3.5.6, $\lambda_i$ is piecewise continuously differentiable on $D$. Therefore, $\lambda_i$ is locally Lipschitz by Corollary 4.1.1 in [126]. By Proposition 4.1.2 in [126], $\lambda_{\max}$ is locally Lipschitz. From Theorem 3.1.1 in [40], $\lambda_{\max}$ is differentiable on $D$ at all points outside of a Lebesgue nullset.
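As a numerical illustration of the quantities in Lemma 3.5.5 and Theorem 3.5.7, the following sketch forms $Z(A) = |m(A)^{-1}| \, \mathrm{rad}(A)$ for a 2×2 interval matrix given by its lower and upper bound matrices, and estimates $\lambda_{\max}$ by power iteration. The restriction to 2×2 matrices, the function names, and the choice of power iteration are assumptions made here for brevity; power iteration is used because $Z(A)$ is entrywise nonnegative, and it may fail to converge for matrices that are not primitive.

```python
from typing import List

Matrix = List[List[float]]


def z_matrix(a_lo: Matrix, a_hi: Matrix) -> Matrix:
    """Z(A) = |m(A)^{-1}| rad(A) for a 2x2 interval matrix [a_lo, a_hi]."""
    m = [[0.5 * (a_lo[i][j] + a_hi[i][j]) for j in range(2)] for i in range(2)]
    r = [[0.5 * (a_hi[i][j] - a_lo[i][j]) for j in range(2)] for i in range(2)]
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    if det == 0.0:
        raise ValueError("midpoint matrix is singular")
    # Entrywise absolute value of the closed-form 2x2 inverse.
    inv_abs = [[abs(m[1][1] / det), abs(m[0][1] / det)],
               [abs(m[1][0] / det), abs(m[0][0] / det)]]
    return [[sum(inv_abs[i][k] * r[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def spectral_radius(z: Matrix, iters: int = 200) -> float:
    """Power-iteration estimate of max_i |lambda_i(z)| for nonnegative z."""
    n = len(z)
    v = [1.0] * n
    lam = 0.0
    for _ in range(iters):
        w = [sum(z[i][k] * v[k] for k in range(n)) for i in range(n)]
        lam = max(abs(x) for x in w)  # infinity-norm scaling factor
        if lam == 0.0:
            return 0.0
        v = [x / lam for x in w]
    return lam
```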

From Theorem 3.5.7, $\lambda_{\max}$ is differentiable almost everywhere, provided $\lambda_i$ is piecewise continuously differentiable on $D$. The Lipschitz result of $\lambda_{\max}$ may be too restrictive since, in general, $\lambda_{\max}$ is not Lipschitz for non-symmetric matrices. However, pseudo-derivative information may still be available since it may be possible to calculate subgradient information of $\lambda_{\max}$ (e.g., see [23]). Now, $\lambda_{\max}$ will be defined more precisely as a function $\lambda_{\max} : D \subset \mathbb{R}^{n_x \times n_x} \to \mathbb{R}$ with $\lambda_{\max}(A(X^l, P^l))$ as the extremal eigenvalue(s) of the matrix $A(X^l, P^l) \equiv |m(J_x(X^l, P^l))^{-1}| \, \mathrm{rad}(J_x(X^l, P^l))$ with $(X^l, P^l) \in \mathbb{I}D_x \times \mathbb{I}D_p$. By expressing the parameter space P as the real vector $p^B = (p_1^L, p_1^U, \ldots, p_{n_p}^L, p_{n_p}^U)^T$, the gradient vector of $\lambda_{\max}$ with respect to $p^B$ evaluated at $A(X^l, P^l)$, if it is defined, can be expressed as

$$\nabla_{p^B} \lambda_{\max}(A(X^l, P^l)) = \left( \frac{\partial \lambda_{\max}}{\partial p_1^L}(A(X^l, P^l)), \frac{\partial \lambda_{\max}}{\partial p_1^U}(A(X^l, P^l)), \ldots, \frac{\partial \lambda_{\max}}{\partial p_{n_p}^L}(A(X^l, P^l)), \frac{\partial \lambda_{\max}}{\partial p_{n_p}^U}(A(X^l, P^l)) \right).$$
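One way to obtain the partial derivatives with respect to the interval bounds is forward-mode automatic differentiation with dual numbers. The sketch below is a toy under loudly stated assumptions: it takes $n_x = n_p = 1$ and a hypothetical model in which the interval Jacobian equals the parameter interval itself, $J_x(X, P) = [p^L, p^U]$, so that $\lambda_{\max} = |m(P)^{-1}| \, \mathrm{rad}(P)$. The `Dual` class and function names are illustrative only.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Dual:
    """A value together with its derivative along one seed direction."""
    val: float
    der: float = 0.0

    @staticmethod
    def lift(x):
        return x if isinstance(x, Dual) else Dual(float(x))

    def __add__(self, other):
        o = Dual.lift(other)
        return Dual(self.val + o.val, self.der + o.der)

    def __sub__(self, other):
        o = Dual.lift(other)
        return Dual(self.val - o.val, self.der - o.der)

    def __mul__(self, other):
        o = Dual.lift(other)
        return Dual(self.val * o.val, self.val * o.der + self.der * o.val)

    __rmul__ = __mul__

    def __truediv__(self, other):
        o = Dual.lift(other)
        return Dual(self.val / o.val,
                    (self.der * o.val - self.val * o.der) / (o.val * o.val))

    def __abs__(self):
        # |.| is only piecewise differentiable; at 0 the right-hand branch is used.
        sign = 1.0 if self.val >= 0.0 else -1.0
        return Dual(abs(self.val), sign * self.der)


def lam_max_scalar(p_lo: Dual, p_hi: Dual) -> Dual:
    """Toy 1x1 case: lambda_max = |m(P)^{-1}| rad(P) with J_x = P (assumed)."""
    mid = 0.5 * (p_lo + p_hi)
    rad = 0.5 * (p_hi - p_lo)
    return abs(Dual(1.0) / mid) * rad


def gradient(p_lo: float, p_hi: float) -> Tuple[float, float]:
    """(d lambda_max / d p^L, d lambda_max / d p^U), one forward pass per bound."""
    d_lo = lam_max_scalar(Dual(p_lo, 1.0), Dual(p_hi, 0.0)).der
    d_hi = lam_max_scalar(Dual(p_lo, 0.0), Dual(p_hi, 1.0)).der
    return d_lo, d_hi
```

For P = [1, 3] this gives $\lambda_{\max} = 0.5$ with gradient $(-0.375, 0.125)$: raising the lower bound or lowering the upper bound decreases $\lambda_{\max}$ in this toy model.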

Computationally, $\lambda_{\max}$ can be approximated using a fixed-point iteration such as [...] the algorithm calculates it. Furthermore, $\nabla_{p^B} \lambda_{\max}$ can be evaluated using forward automatic differentiation [50], which was performed for this chapter using an in-house C++ library. The following procedure defines the strategy for partitioning P.

Subroutine 3.5.8.

1. For a box $(X^l, P^l) \in \mathbb{I}X^0 \times \mathbb{I}P^0$, evaluate $\nabla_{p^B} \lambda_{\max}(A(X^l, P^l))$.

2. Choose the component of $\nabla_{p^B} \lambda_{\max}$ with the largest magnitude and determine its corresponding component of $P^l$. Store this component as $j_{\max}$. If $w(P^l_{j_{\max}}) < \epsilon_P$, $P^l_{j_{\max}}$ is too narrow to be cut. Choose the component with the next largest magnitude and repeat until a $j_{\max}$ is found with $w(P^l_{j_{\max}}) > \epsilon_P$. If no such $j_{\max}$ exists, terminate. P cannot be partitioned further.

3. Let $p_{j_{\max}} = (p^{l,L}_{j_{\max}}, p^{l,U}_{j_{\max}})$ and $d = \left( (\nabla_{p^B} \lambda_{\max}(A(X^l, P^l)))_{2 j_{\max} - 1}, (\nabla_{p^B} \lambda_{\max}(A(X^l, P^l)))_{2 j_{\max}} \right)$ and take a steepest-descent step $p^{new}_{j_{\max}} := p_{j_{\max}} - \alpha d$ with $\alpha > 0$.

4. Check if $(p^{new}_{j_{\max}})_1 < (p^{new}_{j_{\max}})_2$ and that $(p^{new}_{j_{\max}})_1$ and $(p^{new}_{j_{\max}})_2$ are separated at least by $\epsilon_P$. If so, continue, else bisect $P^l_{j_{\max}}$ down the middle and return two new boxes to the main algorithm to be placed on the stack.

5. Set $P^*_i := P^l_i$ for $i \neq j_{\max}$ and $P^*_{j_{\max}} := [(p^{new}_{j_{\max}})_1, (p^{new}_{j_{\max}})_2]$. Check if $\lambda_{\max}(A(X^l, P^*)) < 1.5$ and $w(P^*_{j_{\max}}) > \max\left\{ \tfrac{1}{3} w(P^l_{j_{\max}}), \tfrac{1}{4} w(P^0_{j_{\max}}) \right\}$. If true, partition $P^l_{j_{\max}}$ at $(p^{new}_{j_{\max}})_1$ and $(p^{new}_{j_{\max}})_2$ and return three new boxes to the main algorithm to be placed on the stack.

6. Set $P^l_{j_{\max}} := [(p^{new}_{j_{\max}})_1, (p^{new}_{j_{\max}})_2]$ and go to step 3.
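Steps 2 through 4 of Subroutine 3.5.8 can be sketched for a single pass as follows, assuming the gradient has already been evaluated and is ordered as $(p_1^L, p_1^U, \ldots, p_{n_p}^L, p_{n_p}^U)$. The function name, the return convention, and the zero-based indexing are illustrative assumptions; the acceptance test of step 5 and the loop of step 6 are deliberately left out.

```python
from typing import List, Optional, Sequence, Tuple

Interval = Tuple[float, float]


def descent_step(p_box: List[Interval], grad: Sequence[float],
                 eps_p: float, alpha: float) -> Optional[tuple]:
    """One pass of steps 2-4: pick j_max, step the bounds, or fall back to bisection."""
    # Step 2: gradient entries by decreasing magnitude; entry k belongs to
    # parameter k // 2 (zero-based). Skip parameters too narrow to cut.
    order = sorted(range(len(grad)), key=lambda k: abs(grad[k]), reverse=True)
    j_max = next((k // 2 for k in order
                  if p_box[k // 2][1] - p_box[k // 2][0] > eps_p), None)
    if j_max is None:
        return None  # P cannot be partitioned further

    # Step 3: steepest-descent step on the two bounds of component j_max.
    lo, hi = p_box[j_max]
    new_lo = lo - alpha * grad[2 * j_max]
    new_hi = hi - alpha * grad[2 * j_max + 1]

    # Step 4: keep the step only if the new bounds are ordered and separated.
    if new_lo < new_hi and new_hi - new_lo >= eps_p:
        return ("step", j_max, (new_lo, new_hi))
    return ("bisect", j_max, 0.5 * (lo + hi))
```

With `p_box = [(1.0, 3.0), (0.0, 1.0)]`, `grad = [-0.375, 0.125, 0.01, -0.01]`, `eps_p = 0.1`, and `alpha = 2.0`, the step moves the first parameter's bounds to (1.75, 2.75); with `alpha = 4.0` the bounds collapse and the sketch falls back to bisection at 2.0.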

Both the standard partitioning strategy of bisection in the widest dimension as well as that of Subroutine 3.5.8 were implemented as part of Algorithm 3.1.
