Steps 5 and 6 of Algorithm 3.1 were stated generically in order to emphasize that the user may supply any appropriate partitioning strategy. In this section, the strategies implemented in the algorithm and used in the examples are discussed. First, the strategy for partitioning the X space is discussed.
X Partitioning Strategies
The intuitive picture of partitioning the X space is to take a box that may include multiple solution branches and generate new boxes that are likely to contain either no solution branches or a single solution branch. In other words, the objective of partitioning X is to separate solution branches into their own boxes. This is essentially the standard bisection strategy, with one subtlety: the X interval is cut such that the newly generated intervals either contain solution branches on all of P or contain no solution branches, excluding the partial enclosure scenario altogether. Since bisecting X at the midpoint of the $i$th component will likely produce a partial enclosure scenario, midpoint bisection cannot be employed blindly, as is common in classical generalized bisection algorithms. The simplest way to avoid making a cut that generates partial enclosures is to search for a cut position that satisfies Theorem 3.4.1. The strategy implemented in this thesis determines a component with the maximum width, $i \in \arg\max_j w(X_j)$, and searches for a $\nu \in (0, 1)$, if one exists, such that

$$\tilde{x}_i := \nu \left( x_i^L + \frac{\mathrm{rad}(X_i)}{\mu} \right) + (1 - \nu) \left( x_i^U - \frac{\mathrm{rad}(X_i)}{\mu} \right) \tag{3.12}$$

satisfies Theorem 3.4.1, with $\mu \in \{m \in \mathbb{R} : m > 1\}$. That is, $\tilde{x}_i$ is chosen such that the cut does not intersect a solution branch, as guaranteed by Theorem 3.4.1, thereby avoiding a partial enclosure scenario. The purpose of $\nu$ in (3.12) is to make it easy to evaluate many candidate values of $\tilde{x}_i$, each a convex combination of the two shifted interval bounds. The parameter $\mu$ gives the user control over the interval from which $\tilde{x}_i$ can be chosen: it determines how much of $X_i$ is considered for partitioning.
[Figure 3-1: two panels in the (x, p) plane, one for $\mu = 2$ and one for $\mu = 3$, each marking $x^L$, $x^U$, $m(X)$, $x^L + \mathrm{rad}(X)/\mu$ and $x^U - \mathrm{rad}(X)/\mu$ on the interval X over P.]
Figure 3-1: The effects of $\mu$ on the region considered for partitioning X in one dimension.
The value for $\mu$ can be chosen freely and tuned according to the system being bounded. As $\mu$ gets very large, the interval of candidate cut positions approaches $X_i$, and thus there is a larger potential to find cut positions very close to the original bounds of $X_i$, producing one very narrow interval and one wide interval nearly the width of $X_i$. Although this is conservative and poses no theoretical problems for the algorithm, it may be quite inefficient. Conversely, values of $\mu$ close to 1 can pose problems for the algorithm since they restrict the candidate $\tilde{x}_i$ values to a small neighborhood of the midpoint of $X_i$, where it will generally be very difficult to find partitions of $X_i$ that avoid partial enclosures. The effect of $\mu$ is illustrated in Figure 3-1 and the implementation of this strategy is discussed in Section 3.6.
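As a small illustration of (3.12), the following Python sketch enumerates candidate cut positions for a single component of X. The function name `candidate_cuts`, the evenly spaced grid of $\nu$ values, and the parameter `num` are assumptions made for illustration; the text does not specify how $\nu$ is sampled, and the check against Theorem 3.4.1 is not included.

```python
def candidate_cuts(x_lo, x_hi, mu, num=9):
    """Candidate cut positions from Eq. (3.12) for one component [x_lo, x_hi].

    mu > 1 controls how much of the interval is considered;
    num is an illustrative number of nu samples in the open interval (0, 1).
    """
    if mu <= 1.0:
        raise ValueError("mu must be greater than 1")
    rad = 0.5 * (x_hi - x_lo)
    left = x_lo + rad / mu    # x^L + rad(X)/mu
    right = x_hi - rad / mu   # x^U - rad(X)/mu
    nus = [k / (num + 1) for k in range(1, num + 1)]
    # Convex combination of the two shifted bounds, one value per nu.
    return [nu * left + (1.0 - nu) * right for nu in nus]


# With mu = 2 on [0, 4], the candidates lie in (1, 3).
print(candidate_cuts(0.0, 4.0, 2.0, num=3))
```

A larger `mu` spreads the candidates toward the bounds of the interval, while a value close to 1 concentrates them around the midpoint, matching the behavior described above.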
If a partition for $X_i$ cannot be identified, again with $i \in \arg\max_j w(X_j)$, the strategy is to increase the value of $\mu$ and search for a partition in all components of X. If a partition for X still cannot be identified, the P interval must be partitioned. Figure 3-2(a) illustrates a scenario in which an interval X encloses multiple solution branches but there does not exist a partition (candidate positions represented as dashed lines) that separates them while avoiding the partial enclosure scenario. Figure 3-2(b) illustrates how, after finding a position to partition P, there exist partitions of X such that the solution branches are separated and no partial enclosures are generated. The next section discusses a strategy for cutting P.

Figure 3-2: (a) A box X × P in which there does not exist a position to cut X (dashed lines) such that no partial enclosures are produced. (b) After cutting P (into $P_1$ and $P_2$), there exist positions to cut X avoiding partial enclosures.
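The fallback logic described above can be sketched as follows. The predicate `cut_is_safe` is a hypothetical stand-in for the test of Theorem 3.4.1, which is not reproduced here, and the values `mu`, `mu_larger`, and the $\nu$ grid are illustrative assumptions rather than settings taken from the thesis.

```python
def find_x_cut(box, cut_is_safe, mu=2.0, mu_larger=8.0, num=9):
    """Search for a cut of X that avoids partial enclosures.

    box         : list of (lo, hi) pairs, one per component of X
    cut_is_safe : hypothetical callable (component, position) -> bool,
                  standing in for the check of Theorem 3.4.1
    Returns (component, position), or None if P should be partitioned.
    """
    def candidates(lo, hi, m):
        # Candidate positions from Eq. (3.12) on an even grid of nu values.
        rad = 0.5 * (hi - lo)
        left, right = lo + rad / m, hi - rad / m
        return [nu * left + (1.0 - nu) * right
                for nu in (k / (num + 1) for k in range(1, num + 1))]

    # First attempt: widest component only, with the initial mu.
    widest = max(range(len(box)), key=lambda j: box[j][1] - box[j][0])
    for x in candidates(*box[widest], mu):
        if cut_is_safe(widest, x):
            return widest, x

    # Second attempt: increase mu and search every component of X.
    for j, (lo, hi) in enumerate(box):
        for x in candidates(lo, hi, mu_larger):
            if cut_is_safe(j, x):
                return j, x

    # No admissible cut of X was found: the caller should partition P.
    return None
```

Returning `None` rather than raising an error keeps the decision to partition P with the calling algorithm, which is one possible design among several.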
P Partitioning Strategies
Due to the width of the parameter interval P, verifying existence and uniqueness of enclosed solution branches may be difficult or impossible, as illustrated in Figure 3-2, even with the sharper existence and uniqueness test of Theorem 3.4.4. Therefore, a strategy for partitioning P must also be considered. An efficient strategy is to simply bisect down the middle of the widest dimension. The problem with this strategy is that, in general, the parameters do not contribute equally to overestimation or to the failure to pass the existence and uniqueness test of Theorem 3.4.4. It is therefore desirable to have a method for determining which parameter dimension has the largest influence on the extremal eigenvalue(s) of A, as defined in Theorem 3.4.4. With such information, the parameter interval P can be partitioned in an intelligent manner with the objective of generating boxes that pass the existence and uniqueness test of Theorem 3.4.4.
Suppose $\lambda_{\max}$ of Theorem 3.4.4 is treated as a real-valued function of the bounds of X and P. If pseudo-derivative information of $\lambda_{\max}$ with respect to the bounds of P can be calculated, it reveals the influence of the parameter interval on passing the existence and uniqueness test of Theorem 3.4.4. Furthermore, this information can be used to provide intelligent positions to cut P that will yield boxes that are more likely to pass the existence and uniqueness test of Theorem 3.4.4. As was illustrated in Figure 3-2(b), how one cuts P may have a large impact on how the X box is subsequently partitioned and on the ability to pass existence and uniqueness tests. The following results establish how this pseudo-derivative information can be calculated and how it is used in the partitioning strategy.
Definition 3.5.2 (Piecewise Continuously Differentiable [40]). A continuous function $g : D \subset \mathbb{R}^n \to \mathbb{R}^m$ is said to be piecewise continuously differentiable near $z \in D$ if there exists an open neighborhood $N \subset D$ of $z$ and a finite family of continuously differentiable functions $g_1, g_2, \ldots, g_k : N \to \mathbb{R}^m$, for $k \in \mathbb{N}$ ($k > 0$), such that $g(y)$ is an element of $\{g_1(y), g_2(y), \ldots, g_k(y)\}$ for all $y \in N$.
Definition 3.5.3 ([127]). Let $D \subset \mathbb{IR}^{n \times n}$ be open and let $F : D \to \mathbb{R}^{n \times n}$. $F$ will be called piecewise continuously differentiable on $D$ if for every piecewise continuously differentiable function $M : E \subset \mathbb{R}^{2nn} \to \mathbb{IR}^{n \times n}$, the mapping $F(M(\cdot)) : E \to \mathbb{R}^{n \times n}$ is piecewise continuously differentiable on the open set $E_D = \{a \in E : M(a) \in D\}$.

Definition 3.5.4 ([127]). Let $D \subset \mathbb{R}^{n \times n}$ be open and let $f : D \to \mathbb{R}$. $f$ will be called piecewise continuously differentiable on $D$ if for every piecewise continuously differentiable function $M : E \subset \mathbb{R}^{nn} \to \mathbb{R}^{n \times n}$, the mapping $f(M(\cdot)) : E \to \mathbb{R}$ is piecewise continuously differentiable on the open set $E_D = \{a \in E : M(a) \in D\}$.
Lemma 3.5.5. Let $D_A \subset \mathbb{IR}^{n \times n}$ be open such that for every $A \in D_A$, $m(A)$ is nonsingular. Define the real matrix-valued function $Z : D_A \to \mathbb{R}^{n \times n}$ as $Z(A) \equiv |m(A)^{-1}| \, \mathrm{rad}(A)$, $\forall A \in D_A$. Then $Z$ is piecewise continuously differentiable on $D_A$.
Proof. By definition, $\mathrm{rad}(\cdot)$ and $m(\cdot)$ are continuously differentiable on $D_A$. Since $m(A)$ is nonsingular $\forall A \in D_A$, $m(\cdot)^{-1}$ exists and can be expressed using only elementary arithmetic operations, and is therefore continuously differentiable on $D_A$. By definition, $|\cdot|$ is piecewise continuously differentiable on $D_A$. It immediately follows that $Z$ is piecewise continuously differentiable on $D_A$.
Assumption 3.5.6. Let $D \subset \mathbb{R}^{n \times n}$ be open. Let $Z$ and $D_A$ be as in Lemma 3.5.5 such that $Z(A) \in D$ for every $A \in D_A$. Let $\lambda_i(Z(A))$ be the $i$th real eigenvalue of $Z(A)$. It will be assumed that $\lambda_i : D \to \mathbb{R}$ is piecewise continuously differentiable on $D$.
Theorem 3.5.7. Suppose Assumption 3.5.6 holds. Define the function $\lambda_{\max} : \mathbb{R}^{n \times n} \to \mathbb{R}$ as $\lambda_{\max}(Z(A)) = \max_i \{|\lambda_i(Z(A))|\}$, the magnitude of the extremal eigenvalue(s) of $Z(A)$, $\forall A \in D_A$. Then $\lambda_{\max}$ is differentiable on $D$ at all points outside of a Lebesgue nullset.
Proof. By Assumption 3.5.6, $\lambda_i$ is piecewise continuously differentiable on $D$. Therefore, $\lambda_i$ is locally Lipschitz by Corollary 4.1.1 in [126]. By Proposition 4.1.2 in [126], $\lambda_{\max}$ is locally Lipschitz. From Theorem 3.1.1 in [40], $\lambda_{\max}$ is differentiable on $D$ at all points outside of a Lebesgue nullset.
From Theorem 3.5.7, $\lambda_{\max}$ is differentiable almost everywhere, provided $\lambda_i$ is piecewise continuously differentiable on $D$. The Lipschitz result for $\lambda_{\max}$ may be too restrictive since, in general, $\lambda_{\max}$ is not Lipschitz for non-symmetric matrices. However, pseudo-derivative information may still be available since it may be possible to calculate subgradient information of $\lambda_{\max}$ (e.g., see [23]). Now, $\lambda_{\max}$ will be defined more precisely as a function $\lambda_{\max} : D \subset \mathbb{R}^{n_x \times n_x} \to \mathbb{R}$ with $\lambda_{\max}(A(X^l, P^l))$ as the extremal eigenvalue(s) of the matrix $A(X^l, P^l) \equiv |m(J_x(X^l, P^l))^{-1}| \, \mathrm{rad}(J_x(X^l, P^l))$ with $(X^l, P^l) \in \mathbb{I}D_x \times \mathbb{I}D_p$. By expressing the parameter space P as the real vector $p_B = (p_1^L, p_1^U, \ldots, p_{n_p}^L, p_{n_p}^U)^T$, the gradient vector of $\lambda_{\max}$ with respect to $p_B$ evaluated at $A(X^l, P^l)$, if it is defined, can be expressed as

$$\nabla_{p_B} \lambda_{\max}(A(X^l, P^l)) = \left( \frac{\partial \lambda_{\max}}{\partial p_1^L}(A(X^l, P^l)), \frac{\partial \lambda_{\max}}{\partial p_1^U}(A(X^l, P^l)), \ldots, \frac{\partial \lambda_{\max}}{\partial p_{n_p}^L}(A(X^l, P^l)), \frac{\partial \lambda_{\max}}{\partial p_{n_p}^U}(A(X^l, P^l)) \right).$$
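To make the quantity being differentiated concrete, the sketch below estimates the magnitude of the extremal eigenvalue of an entrywise nonnegative matrix, such as $A(X^l, P^l)$, by a plain power iteration. This is an illustrative choice, not necessarily the iteration used in the thesis; the function name `spectral_radius`, the iteration limit, and the tolerance are assumptions. Convergence is assumed, which holds for example when the matrix is primitive in the Perron-Frobenius sense.

```python
def spectral_radius(A, iters=200, tol=1e-12):
    """Power-iteration estimate of the largest eigenvalue magnitude.

    A is a square matrix given as a list of rows with nonnegative entries.
    Assumption: the iteration converges (e.g. A is primitive).
    """
    n = len(A)
    v = [1.0] * n          # positive starting vector
    lam = 0.0
    for _ in range(iters):
        # w = A v
        w = [sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]
        new_lam = max(abs(x) for x in w)   # infinity-norm scaling factor
        if new_lam == 0.0:
            return 0.0
        v = [x / new_lam for x in w]       # renormalize so max |v_i| = 1
        if abs(new_lam - lam) <= tol * max(1.0, new_lam):
            return new_lam
        lam = new_lam
    return lam


# Symmetric example with eigenvalues 0.75 and 0.25.
print(spectral_radius([[0.5, 0.25], [0.25, 0.5]]))
```

Because the routine uses only elementary arithmetic, `max`, and `abs`, it is the kind of computation to which forward-mode differentiation can be applied almost everywhere, in line with Theorem 3.5.7.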
Computationally, $\lambda_{\max}$ can be approximated using a fixed-point iteration, which is how the algorithm calculates it. Furthermore, $\nabla_{p_B} \lambda_{\max}$ can be evaluated using forward automatic differentiation [50], which was performed for this chapter using an in-house C++ library. The following procedure defines the strategy for partitioning P.
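Subroutine 3.5.8.
1. For a box $(X^l, P^l) \in \mathbb{I}X^0 \times \mathbb{I}P^0$, evaluate $\nabla_{p_B} \lambda_{\max}(A(X^l, P^l))$.
2. Choose the component of $\nabla_{p_B} \lambda_{\max}$ with the largest magnitude and determine its corresponding component of $P^l$. Store this component as $j_{\max}$. If $w(P^l_{j_{\max}}) < \epsilon_P$, $P^l_{j_{\max}}$ is too narrow to be cut. Choose the component with the next largest magnitude and repeat until a $j_{\max}$ is found with $w(P^l_{j_{\max}}) > \epsilon_P$. If no such $j_{\max}$ exists, terminate: P cannot be partitioned further.
3. Let $p_{j_{\max}} = \left( p^{l,L}_{j_{\max}}, p^{l,U}_{j_{\max}} \right)$ and $d = \left( \left( \nabla_{p_B} \lambda_{\max}(A(X^l, P^l)) \right)_{2 j_{\max} - 1}, \left( \nabla_{p_B} \lambda_{\max}(A(X^l, P^l)) \right)_{2 j_{\max}} \right)$ and take a steepest-descent step $p^{\mathrm{new}}_{j_{\max}} := p_{j_{\max}} - \alpha d$ with $\alpha > 0$.
4. Check if $(p^{\mathrm{new}}_{j_{\max}})_1 < (p^{\mathrm{new}}_{j_{\max}})_2$ and that $(p^{\mathrm{new}}_{j_{\max}})_1$ and $(p^{\mathrm{new}}_{j_{\max}})_2$ are separated by at least $\epsilon_P$. If so, continue; otherwise bisect $P^l_{j_{\max}}$ down the middle and return two new boxes to the main algorithm to be placed on the stack.
5. Set $P^*_i := P^l_i$ for $i \neq j_{\max}$ and $P^*_{j_{\max}} := [(p^{\mathrm{new}}_{j_{\max}})_1, (p^{\mathrm{new}}_{j_{\max}})_2]$. Check if $\lambda_{\max}(A(X^l, P^*)) < 1.5$ and $w(P^*_{j_{\max}}) > \max\left\{ \tfrac{1}{3} w(P^l_{j_{\max}}), \tfrac{1}{4} w(P^0_{j_{\max}}) \right\}$. If true, partition $P^l_{j_{\max}}$ at $(p^{\mathrm{new}}_{j_{\max}})_1$ and $(p^{\mathrm{new}}_{j_{\max}})_2$ and return three new boxes to the main algorithm to be placed on the stack.
6. Set $P^l_{j_{\max}} := [(p^{\mathrm{new}}_{j_{\max}})_1, (p^{\mathrm{new}}_{j_{\max}})_2]$ and go to step 3.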
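The steps of Subroutine 3.5.8 can be sketched in Python as follows. This is an interpretation under stated assumptions, not the thesis implementation: `grad_fn` and `lam_fn` are hypothetical callables returning $\nabla_{p_B} \lambda_{\max}$ and $\lambda_{\max}$ for a given P with X held fixed, the gradient is re-evaluated on each return to step 3, the cuts in step 5 are applied to the interval received on entry so that the returned boxes cover it, a step that leaves that interval is treated like a failed step 4, and `max_steps` is an added safeguard against endless iteration.

```python
def partition_p(P, P0, grad_fn, lam_fn, eps_p=1e-3, alpha=0.1, max_steps=50):
    """Sketch of Subroutine 3.5.8. P and P0 are lists of (lo, hi) pairs.

    grad_fn(Q) -> gradient w.r.t. (p1L, p1U, ..., pnL, pnU)  [hypothetical]
    lam_fn(Q)  -> lambda_max for parameter box Q             [hypothetical]
    Returns a list of one, two, or three parameter boxes.
    """
    # Steps 1-2: rank gradient entries by magnitude, skip narrow components.
    g = grad_fn(P)
    order = sorted(range(len(g)), key=lambda k: abs(g[k]), reverse=True)
    j = next((k // 2 for k in order
              if P[k // 2][1] - P[k // 2][0] > eps_p), None)
    if j is None:
        return [P]  # P cannot be partitioned further
    lo0, hi0 = P[j]

    def with_j(interval):
        Q = list(P)
        Q[j] = interval
        return Q

    def bisect():
        mid = 0.5 * (lo0 + hi0)
        return [with_j((lo0, mid)), with_j((mid, hi0))]

    lo, hi = lo0, hi0
    for _ in range(max_steps):
        # Step 3: steepest-descent step on the two bounds of component j.
        g = grad_fn(with_j((lo, hi)))
        a = lo - alpha * g[2 * j]
        b = hi - alpha * g[2 * j + 1]
        # Step 4: bounds must stay ordered, separated, and inside [lo0, hi0].
        if not (lo0 <= a and b <= hi0 and b - a >= eps_p):
            return bisect()
        # Step 5: accept if the test value is small and the piece is wide.
        wide = max((hi - lo) / 3.0, (P0[j][1] - P0[j][0]) / 4.0)
        if lam_fn(with_j((a, b))) < 1.5 and (b - a) > wide:
            pieces = [(lo0, a), (a, b), (b, hi0)]
            return [with_j(q) for q in pieces if q[1] > q[0]]
        # Step 6: shrink the working interval and repeat from step 3.
        lo, hi = a, b
    return bisect()  # safeguard, not part of the subroutine as stated
```

For a one-dimensional P = [0, 1] with constant gradient (-1, 1), `alpha=0.25`, and a `lam_fn` that always returns 1.0, the first step yields the cut points 0.25 and 0.75 and three boxes are returned.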
Both the standard partitioning strategy of bisection in the widest dimension as well as that of Subroutine 3.5.8 were implemented as part of Algorithm 3.1.