We follow Boyd et al. [2011] to introduce the Alternating Direction Method of Multipliers (ADMM), a widely used method for solving convex optimization problems with a large number of variables. ADMM works by decomposing a large optimization problem into a set of smaller subproblems that are easier to handle. Before introducing ADMM, we first review a precursor method, the Lagrangian dual ascent method.
2.4.1.1 Lagrangian Dual Ascent Method
Consider an equality-constrained convex problem of the form
$$
\min_{x,z} \; f(x) + g(z) \quad \text{s.t.} \quad Ax + Bz = c \tag{2.34}
$$
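with variables $x \in \mathbb{R}^n$ and $z \in \mathbb{R}^m$, where $A \in \mathbb{R}^{p \times n}$, $B \in \mathbb{R}^{p \times m}$, $c \in \mathbb{R}^p$, and $f : \mathbb{R}^n \to \mathbb{R}$ and $g : \mathbb{R}^m \to \mathbb{R}$ are convex. A conventional way to solve Problem (2.34) is via the Lagrangian dual ascent method. The Lagrangian of Problem (2.34) can be written as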
$$
L(x,z,y) = f(x) + g(z) + y^T(Ax + Bz - c), \tag{2.35}
$$
and the Lagrangian dual function is
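$$
g(y) = \inf_{x,z} L(x,z,y), \tag{2.36}
$$
where $y \in \mathbb{R}^p$ is the dual variable or Lagrange multiplier. With a slight abuse of notation, $g(y)$ denotes the dual function, which is distinct from the objective term $g(z)$. The dual problem becomes
$$
\max_{y} \; g(y). \tag{2.37}
$$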
For convex problems, strong duality holds under mild constraint qualifications (e.g., Slater's condition), i.e., the optimal values of the primal problem (2.34) and the dual problem (2.37) are the same. Under strict convexity, we can recover a primal optimal point $(x^*, z^*)$ from a dual optimal point $y^*$ as
$$
(x^*, z^*) = \operatorname*{argmin}_{x,z} L(x,z,y^*). \tag{2.38}
$$
In the Lagrangian dual ascent method, the dual problem is solved using gradient ascent. The gradient of $g(y)$ with respect to $y$ can be computed as follows. First find $(\hat{x}, \hat{z}) = \operatorname*{argmin}_{x,z} L(x,z,y)$ for the current $y$; then compute the gradient as $\nabla g(y) = A\hat{x} + B\hat{z} - c$, which is just the residual of the equality constraint. The dual ascent method works by iterating over the following steps:
$$
(x^{k+1}, z^{k+1}) = \operatorname*{argmin}_{x,z} L(x,z,y^k) \tag{2.39}
$$
$$
y^{k+1} = y^k + \alpha_k (Ax^{k+1} + Bz^{k+1} - c), \tag{2.40}
$$
where $\alpha_k > 0$ is a step size and $k$ is the iteration counter. Here, the Lagrangian is minimized jointly with respect to $x$ and $z$, and, with an appropriate choice of step size, the dual function value is increased in each step via the $y$-update.
Under some assumptions, including $f$ and $g$ being strictly convex and finite, the dual ascent method converges to an optimal solution. In practice, however, these assumptions do not always hold. For example, if $f$ or $g$ is a nonzero affine function, then the update step (2.39) fails because $L$ is unbounded below for most values of $y$. In such cases the dual ascent method cannot be used.
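The iteration (2.39)-(2.40) can be sketched on a toy problem. The example below is an illustration only, not taken from the source: it assumes the strictly convex objectives $f(x) = \tfrac{1}{2}\|x - a\|_2^2$ and $g(z) = \tfrac{1}{2}\|z - b\|_2^2$, for which the joint minimization in (2.39) has a closed form, and a constant step size chosen from the largest eigenvalue of $AA^T + BB^T$. The names `dual_ascent`, `a`, and `b` are hypothetical.

```python
import numpy as np

def dual_ascent(A, B, c, a, b, num_iters=200):
    """Dual ascent for  min 0.5*||x-a||^2 + 0.5*||z-b||^2  s.t.  Ax + Bz = c.

    Toy problem assumed for illustration; f and g are strictly convex,
    so the Lagrangian minimization in (2.39) is well defined.
    """
    y = np.zeros(A.shape[0])
    # For this toy problem the dual Hessian is -(A A^T + B B^T),
    # so a constant step of 1/lambda_max is a safe choice.
    M = A @ A.T + B @ B.T
    alpha = 1.0 / np.linalg.eigvalsh(M).max()
    for _ in range(num_iters):
        # (2.39): joint minimization of L(x, z, y) in closed form.
        x = a - A.T @ y
        z = b - B.T @ y
        # (2.40): gradient ascent step; the gradient is the constraint residual.
        y = y + alpha * (A @ x + B @ z - c)
    return x, z, y

# Small demo with hand-picked data (assumed values).
A = np.array([[1.0, 0.0, 1.0], [0.0, 1.0, 1.0]])
B = np.array([[1.0, 0.0], [0.0, 2.0]])
c = np.array([1.0, -1.0])
a = np.array([1.0, 2.0, 3.0])
b = np.array([0.5, -0.5])
x, z, y = dual_ascent(A, B, c, a, b)
print("constraint residual norm:", np.linalg.norm(A @ x + B @ z - c))
```

If $f$ were replaced by a nonzero affine function, the closed-form `x` update above would no longer exist, which is the failure mode described in the text.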
2.4.1.2 Augmented Lagrangian
ADMM differs from the dual ascent method in two main ways: first, the augmented Lagrangian is introduced, which adds a quadratic penalty term to the Lagrangian; second, the primal variables are not updated jointly, but in an alternating manner. Specifically, the augmented Lagrangian of Problem (2.34) is written as
$$
L_\rho(x,z,y) = f(x) + g(z) + y^T(Ax + Bz - c) + \frac{\rho}{2}\|Ax + Bz - c\|_2^2, \tag{2.41}
$$
where $\rho > 0$ is the penalty parameter. Note that $L_0$ (i.e., $\rho = 0$) is the standard Lagrangian. The augmented Lagrangian can be thought of as the standard Lagrangian of the following problem
$$
\min_{x,z} \; f(x) + g(z) + \frac{\rho}{2}\|Ax + Bz - c\|_2^2 \quad \text{s.t.} \quad Ax + Bz = c. \tag{2.42}
$$
It is easy to see that Problem (2.42) is equivalent to Problem (2.34), because any feasible $x$ and $z$ make the last term of the objective zero. The benefit of adding this penalty term is that the resulting method has far better convergence properties and can handle cases in which $f$ or $g$ is not strictly convex or is not finite everywhere (Boyd et al. [2011]).
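As a small illustration of this point (an assumed example, not taken from the source), suppose $f(x) = q^T x$ is affine with $q \neq 0$ and $A$ has full column rank. For fixed $z$ and $y$, the two Lagrangians depend on $x$ as follows, where "const" collects the terms independent of $x$:

$$
L(x,z,y) = (q + A^T y)^T x + \text{const}, \qquad
L_\rho(x,z,y) = (q + A^T y)^T x + \frac{\rho}{2}\|Ax + Bz - c\|_2^2 + \text{const}.
$$

The first expression is unbounded below in $x$ whenever $q + A^T y \neq 0$, so the minimization step of dual ascent fails. The second is a strictly convex quadratic in $x$ with Hessian $\rho A^T A \succ 0$, so a unique minimizer exists for every $y$.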
With the augmented Lagrangian, the ADMM for solving Problem (2.34) consists of the following iterations:
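$$
x^{k+1} = \operatorname*{argmin}_{x} L_\rho(x, z^k, y^k) \tag{2.43}
$$
$$
z^{k+1} = \operatorname*{argmin}_{z} L_\rho(x^{k+1}, z, y^k) \tag{2.44}
$$
$$
y^{k+1} = y^k + \rho (Ax^{k+1} + Bz^{k+1} - c). \tag{2.45}
$$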
The iterations are stopped when certain stopping criteria are met, e.g., the primal and dual residuals are below a small threshold. This sequential update method is also called the inexact Augmented Lagrange Multipliers (ALM) method in (Lin et al. [2009]).
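The iterations (2.43)-(2.45) can be sketched on a standard test problem. The example below is an illustration under stated assumptions, not the source's own code: it applies ADMM to the lasso, $\min_x \tfrac{1}{2}\|Dx - b\|_2^2 + \lambda\|x\|_1$, rewritten in the form of Problem (2.34) with $f(x) = \tfrac{1}{2}\|Dx - b\|_2^2$, $g(z) = \lambda\|z\|_1$, $A = I$, $B = -I$, and $c = 0$. The names `admm_lasso`, `soft_threshold`, `D`, and `lam`, as well as the tolerance value, are hypothetical choices.

```python
import numpy as np

def soft_threshold(v, kappa):
    """Elementwise soft thresholding, the proximal operator of kappa*||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - kappa, 0.0)

def admm_lasso(D, b, lam, rho=1.0, max_iters=1000, tol=1e-8):
    """ADMM for  min 0.5*||Dx-b||^2 + lam*||z||_1  s.t.  x - z = 0."""
    n = D.shape[1]
    x, z, y = np.zeros(n), np.zeros(n), np.zeros(n)
    # The x-update is a linear system with a fixed matrix D^T D + rho*I.
    lhs = D.T @ D + rho * np.eye(n)
    Dtb = D.T @ b
    for _ in range(max_iters):
        # (2.43): minimize L_rho over x with z, y fixed.
        x = np.linalg.solve(lhs, Dtb - y + rho * z)
        # (2.44): minimize L_rho over z with the new x and y fixed.
        z_old = z
        z = soft_threshold(x + y / rho, lam / rho)
        # (2.45): dual update using the constraint residual x - z.
        r = x - z
        y = y + rho * r
        # Stop when the primal residual and the change in z
        # (proportional to the dual residual) are both small.
        s = rho * (z - z_old)
        if np.linalg.norm(r) < tol and np.linalg.norm(s) < tol:
            break
    return x, z, y

# Demo: with D = I the lasso solution is soft_threshold(b, lam).
b_demo = np.array([3.0, 0.5, -2.0])
x_demo, z_demo, _ = admm_lasso(np.eye(3), b_demo, lam=1.0)
print(z_demo)
```

Each update touches only one block of variables, which is what makes the subproblems cheap: the `x` step is a linear solve and the `z` step is an elementwise operation.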
Recently, ADMM has also been used to solve certain types of non-convex problems (Shen et al. [2014]; Zhang [2010]; Zeng et al. [2012]; Jia et al. [2015]), for example, problems with bilinear terms. These results demonstrate the practical usefulness of ADMM in handling non-convex problems, and recent works (Li and Pong [2015]; Wang et al. [2015]; Zhang et al. [2016]) have provided some theoretical analysis of this setting.
2.4.1.3 A Concrete Example
To make ADMM more concrete, we work through an example of using it to solve a convex problem. We consider the optimization problem (2.23) for LRR (Liu et al. [2010]). For ease of solving the nuclear norm problem, we first introduce an auxiliary variable $Z$:
$$
\min_{C,Z,E} \; \|C\|_* + \lambda\|E\|_{2,1} \quad \text{s.t.} \quad X = XZ + E, \quad C = Z. \tag{2.46}
$$
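Clearly, this problem is equivalent to Problem (2.23). We now derive the augmented Lagrangian as
$$
L_\rho(C,Z,E,Y_1,Y_2) = \|C\|_* + \lambda\|E\|_{2,1} + \operatorname{tr}[Y_1^T(X - XZ - E)] + \operatorname{tr}[Y_2^T(C - Z)] + \frac{\rho}{2}\left(\|X - XZ - E\|_F^2 + \|C - Z\|_F^2\right), \tag{2.47}
$$
where $Y_1$ and $Y_2$ are the Lagrange multipliers.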
We then minimize $L_\rho$ over one of the primal variables $C$, $Z$, and $E$ at a time while fixing the others. In particular, the subproblem for updating $C$ is
$$
C^* = \operatorname*{argmin}_{C} \; \frac{1}{\rho}\|C\|_* + \frac{1}{2}\|C - (Z - Y_2/\rho)\|_F^2, \tag{2.48}
$$
which can be solved in closed form via the singular value thresholding operator. For $Z$, we simply take the derivative of $L_\rho$ with respect to $Z$ and set it to zero, which leads to
$$
Z^* = (X^T X + I)^{-1}\left[X^T(X - E) + C + (X^T Y_1 + Y_2)/\rho\right], \tag{2.49}
$$
and $E$ is updated by solving the subproblem
$$
E^* = \operatorname*{argmin}_{E} \; \frac{\lambda}{\rho}\|E\|_{2,1} + \frac{1}{2}\|E - (X - XZ + Y_1/\rho)\|_F^2, \tag{2.50}
$$
which also leads to a closed-form solution given in (Liu et al. [2010]).
After the primal variables are updated, the dual variables $Y_1$ and $Y_2$ can be updated as
$$
Y_1 := Y_1 + \rho(X - XZ - E) \tag{2.51}
$$
$$
Y_2 := Y_2 + \rho(C - Z). \tag{2.52}
$$
If a varying penalty parameter ρ is used, then ρ is updated after all the primal
and dual variables are updated. These update steps are repeated sequentially until the ADMM algorithm converges.
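The update steps (2.48)-(2.52) can be collected into a short sketch. This is an illustration under stated assumptions rather than the reference implementation of Liu et al. [2010]: it uses a fixed penalty parameter and a fixed number of iterations instead of a convergence test, and it assumes that $\|E\|_{2,1}$ sums the $\ell_2$ norms of the columns of $E$, so that (2.50) is solved by column-wise shrinkage. The function names `svt`, `column_shrink`, and `lrr_admm` are hypothetical.

```python
import numpy as np

def svt(Q, tau):
    """Singular value thresholding: minimizer of tau*||C||_* + 0.5*||C - Q||_F^2."""
    U, s, Vt = np.linalg.svd(Q, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vt

def column_shrink(Q, tau):
    """Minimizer of tau*||E||_{2,1} + 0.5*||E - Q||_F^2 (column-wise norm assumed)."""
    norms = np.linalg.norm(Q, axis=0)
    # Guard against division by zero for all-zero columns.
    scale = np.maximum(1.0 - tau / np.maximum(norms, 1e-12), 0.0)
    return Q * scale

def lrr_admm(X, lam, rho=1.0, num_iters=200):
    """Sketch of the ADMM updates (2.48)-(2.52) with a fixed rho."""
    d, n = X.shape
    C, Z, Y2 = np.zeros((n, n)), np.zeros((n, n)), np.zeros((n, n))
    E, Y1 = np.zeros((d, n)), np.zeros((d, n))
    G = X.T @ X + np.eye(n)  # system matrix of (2.49), constant over iterations
    for _ in range(num_iters):
        C = svt(Z - Y2 / rho, 1.0 / rho)                                    # (2.48)
        Z = np.linalg.solve(G, X.T @ (X - E) + C + (X.T @ Y1 + Y2) / rho)   # (2.49)
        E = column_shrink(X - X @ Z + Y1 / rho, lam / rho)                  # (2.50)
        Y1 = Y1 + rho * (X - X @ Z - E)                                     # (2.51)
        Y2 = Y2 + rho * (C - Z)                                             # (2.52)
    return C, Z, E

# Demo on small synthetic low-rank data (assumed sizes and seed).
rng = np.random.default_rng(0)
X_demo = rng.standard_normal((5, 2)) @ rng.standard_normal((2, 8))
C_demo, Z_demo, E_demo = lrr_admm(X_demo, lam=0.5)
print(C_demo.shape, E_demo.shape)
```

In practice the loop would be terminated by a check on the residuals $X - XZ - E$ and $C - Z$, and $\rho$ could be increased after each pass as described above; both are omitted here to keep the sketch short.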