

In this section, we show the principles behind automatically determining the necessary dimensionality of a given problem. The procedure applies to bilinear programs that minimize the $L_1$ norm, but it also works for general bilinear programs.

The dimensionality is inherently part of the model, not the problem itself. There may be equivalent models of a given problem with very different dimensionality. Thus, procedures for reducing the dimensionality are not necessary when the modeler can create a model with minimal dimensionality. However, this is nontrivial in many cases. In addition, some dimensions may have little impact on the overall performance. To determine which ones can be discarded, we need a measure of their contribution that can be computed efficiently. We define these notions more formally later in this section.

We assume that the feasible sets have bounded $L_2$ norms, and assume a general formulation of the bilinear program, not necessarily in the semi-compact form. Given Assumption 8.7, this can be achieved by scaling the constraints when the feasible region is bounded.

Assumption 8.3. For all $x \in X$ and $y \in Y$, their norms satisfy $\|x\|_2 \le 1$ and $\|y\|_2 \le 1$.
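One way to read the scaling remark is sketched below. It assumes that radii $\rho_x \ge \sup_{x \in X} \|x\|_2$ and $\rho_y \ge \sup_{y \in Y} \|y\|_2$ are known; the symbols $\rho_x$, $\rho_y$, $\hat{x}$, and $\hat{y}$ are introduced here only for illustration and do not appear in the original formulation.

```latex
% Substitute x = \rho_x \hat{x} and y = \rho_y \hat{y}, so that
% \|\hat{x}\|_2 \le 1 and \|\hat{y}\|_2 \le 1 hold on the rescaled feasible sets.
\begin{align*}
  x^{T} C y &= \hat{x}^{T} \left( \rho_x \rho_y C \right) \hat{y}, \\
  r_1^{T} x &= \left( \rho_x r_1 \right)^{T} \hat{x}, \qquad
  r_2^{T} y = \left( \rho_y r_2 \right)^{T} \hat{y}, \\
  A_1 x + B_1 w = b_1 \;&\Longleftrightarrow\; \left( \rho_x A_1 \right) \hat{x} + B_1 w = b_1 .
\end{align*}
% The rescaled bilinear-term matrix is \rho_x \rho_y C, so every eigenvalue of
% C^{T} C is multiplied by (\rho_x \rho_y)^2.
```

Under this reading, the assumption costs nothing in generality for bounded feasible sets, but the eigenvalues that the later bound depends on grow with the radii.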

We discuss the implications of and problems with this assumption after presenting Theorem 8.4. Intuitively, the dimensionality reduction removes those dimensions where $g(y)$ is constant, or almost constant. Interestingly, these dimensions may be recovered based on the eigenvectors and eigenvalues of $C^{T}C$. We use the eigenvectors of $C^{T}C$ instead of the eigenvectors of $C$ because our analysis is based on the $L_2$ norms of $x$ and $y$, and thus on the $L_2$ norm of $C$. The $L_2$ norm $\|C\|_2$ is bounded by the square root of the largest eigenvalue of $C^{T}C$. In addition, a symmetric matrix is required to ensure that the eigenvectors are perpendicular and span the whole space.
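The two properties of $C^{T}C$ used in this argument can be checked numerically. The following is a minimal sketch on an arbitrary random matrix, not a matrix from any particular bilinear program; all variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
C = rng.standard_normal((5, 4))          # arbitrary rectangular example matrix

gram = C.T @ C                           # symmetric positive semidefinite, 4 x 4
eigvals, eigvecs = np.linalg.eigh(gram)  # eigh returns eigenvalues in ascending order

spectral_norm = np.linalg.norm(C, 2)     # L2 (spectral) norm: largest singular value
largest_eig = eigvals[-1]

# The L2 norm of C matches the square root of the largest eigenvalue of C^T C.
print(spectral_norm, np.sqrt(largest_eig))

# Because C^T C is symmetric, its eigenvectors form an orthonormal basis.
print(np.allclose(eigvecs.T @ eigvecs, np.eye(4)))
```

`np.linalg.eigh` is used rather than `np.linalg.eig` because it is designed for symmetric matrices and returns real eigenvalues with orthonormal eigenvectors.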

Given a problem represented using (8.1), let $F$ be a matrix whose columns are all the eigenvectors of $C^{T}C$ with eigenvalues greater than some $\bar\lambda$. Let $G$ be a matrix with all the remaining eigenvectors as columns. Notice that together, the columns of the two matrices span the whole space and are real-valued, since $C^{T}C$ is a symmetric matrix. Assume without loss of generality that the eigenvectors have unit norm. The compressed version of the bilinear program is then the following:

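$$
\begin{aligned}
\max_{w,x,y_1,y_2,z} \quad & \tilde{f}(w, x, y_1, y_2, z) = r_1^{T} x + s_2^{T} w + x^{T} C F y_1 + r_2^{T} \begin{bmatrix} F & G \end{bmatrix} \begin{bmatrix} y_1 \\ y_2 \end{bmatrix} + s_2^{T} z \\
\text{s.t.} \quad & A_1 x + B_1 w = b_1 \\
& A_2 \begin{bmatrix} F & G \end{bmatrix} \begin{bmatrix} y_1 \\ y_2 \end{bmatrix} + B_2 z = b_2 \\
& w, x, y_1, y_2, z \ge 0
\end{aligned}
\tag{8.5}
$$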
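The construction of $F$ and $G$ can be sketched as follows. This is an illustrative sketch only: the function name `split_eigenvectors`, the example matrix, and the threshold value are assumptions, and the code handles only the bilinear term, not the constraints of (8.5).

```python
import numpy as np

def split_eigenvectors(C, lam_bar):
    """Split the eigenvectors of C^T C at the threshold lam_bar into (F, G)."""
    eigvals, eigvecs = np.linalg.eigh(C.T @ C)   # orthonormal eigenvector columns
    keep = eigvals > lam_bar
    F = eigvecs[:, keep]       # directions kept in the bilinear term
    G = eigvecs[:, ~keep]      # directions dropped from the bilinear term
    return F, G, eigvals

rng = np.random.default_rng(1)
# Hypothetical C with singular values 1, 0.5, 0.01, 0.001.
U, _ = np.linalg.qr(rng.standard_normal((6, 4)))
V, _ = np.linalg.qr(rng.standard_normal((4, 4)))
C = U @ np.diag([1.0, 0.5, 0.01, 0.001]) @ V.T

F, G, eigvals = split_eigenvectors(C, lam_bar=1e-3)

# Any y decomposes as y = F y1 + G y2 because [F G] is an orthogonal matrix.
x = rng.standard_normal(6)
y = rng.standard_normal(4)
y1, y2 = F.T @ y, G.T @ y

full_term = x @ C @ y
compressed_term = x @ C @ F @ y1   # bilinear term that appears in (8.5)
dropped_term = x @ C @ G @ y2      # bilinear term absent from (8.5)

print(F.shape[1], G.shape[1])
print(np.isclose(full_term, compressed_term + dropped_term))
```

With this threshold the bilinear term depends on two coordinates of $y$ instead of four; the dropped term is what the error analysis has to bound.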

Notice that the program is missing the element $x^{T} C G y_2$, which would make its optimal solutions identical to the optimal solutions of (8.1). A more practical approach to reducing the dimensionality would be based on the singular value decomposition, which may be applied directly to any bilinear program. The following theorem quantifies the maximum error when using the compressed program.

Figure 8.1. Approximation of the feasible set $Y$ according to Assumption 8.3. (Labels in the figure: the point $\hat{y}$, the set $Y$, and the ball $\|y\|_2 \le 1$.)

Theorem 8.4. Let $f^{*}$ and $\tilde{f}^{*}$ be optimal solutions of (8.1) and (8.5) respectively. Then:

$$\epsilon = |f^{*} - \tilde{f}^{*}| \le \sqrt{\bar\lambda}.$$

Moreover, this is the maximal linear dimensionality reduction possible with this error without considering the constraint structure.

The proof of the theorem can be found in Section C.9.
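As an informal sanity check, and not a substitute for the proof, the size of the dropped term $x^{T} C G y_2$ can be sampled over the unit balls of Assumption 8.3 and compared with $\sqrt{\bar\lambda}$. The matrix, the threshold, and the helper `random_unit_ball_point` below are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical 8 x 6 matrix C with known singular values, so that C^T C has
# eigenvalues 1, 0.64, 0.36, 0.09, 0.04, 0.01.
left, _ = np.linalg.qr(rng.standard_normal((8, 6)))
right, _ = np.linalg.qr(rng.standard_normal((6, 6)))
C = left @ np.diag([1.0, 0.8, 0.6, 0.3, 0.2, 0.1]) @ right.T

lam_bar = 0.1                                # hypothetical eigenvalue threshold
eigvals, eigvecs = np.linalg.eigh(C.T @ C)
G = eigvecs[:, eigvals <= lam_bar]           # eigenvectors dropped by the compression

def random_unit_ball_point(dim):
    """Sample a point whose L2 norm is at most 1."""
    v = rng.standard_normal(dim)
    return v / max(1.0, np.linalg.norm(v))

worst = 0.0
for _ in range(2000):
    x = random_unit_ball_point(8)
    y = random_unit_ball_point(6)
    y2 = G.T @ y                             # coordinates of y along dropped directions
    worst = max(worst, abs(x @ C @ G @ y2))

bound = np.sqrt(lam_bar)
print(worst, bound)                          # the sampled maximum stays below the bound
```

Sampling ignores the linear constraints of the program, so it only probes the relaxation by the unit balls; in a constrained problem the actual error may be much smaller.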

Alternatively, the bound can be proved by replacing the equality $A_1 x + B_1 w = b_1$ by $\|x\|_2 = 1$. The bound can then be obtained from the Lagrange necessary optimality conditions. In these bounds we use the $L_2$ norm; an extension to a different norm is not straightforward. Note also that this dimensionality reduction technique ignores the constraint structure. When the constraints have some special structure, it might be possible to obtain an even tighter bound. As described in the next section, the dimensionality reduction technique generalizes the reduction that Becker, Zilberstein, Lesser, and Goldman (2004) used implicitly.

The result of Theorem 8.4 is based on an approximation of the feasible set $Y$ by $\|y\|_2 \le 1$, as Assumption 8.3 states. This approximation may be quite loose in some problems, which may lead to a significant multiplicative overestimation of the bound in Theorem 8.4. For example, consider the feasible set depicted in Figure 8.1. The bound may be achieved at a point $\hat{y}$, which is far from the feasible region. In specific problems, a tighter bound could be obtained by either appropriately scaling the constraints or using a weighted $L_2$ norm with better precision. We partially address this issue by considering the structure of the constraints. To derive this, consider the following linear program and corresponding theorem:

$$\max_{x} \; c^{T} x \quad \text{s.t.} \quad A x = b, \; x \ge 0 \tag{8.6}$$

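Theorem 8.5. The optimal solution of (8.6) is the same as when the objective function is modified to

$$c^{T} \left( I - A^{T} (A A^{T})^{-1} A \right) x.$$

Proof. The objective function is:

$$
\begin{aligned}
\max_{\{x \,:\, Ax = b,\; x \ge 0\}} c^{T} x
&= \max_{\{x \,:\, Ax = b,\; x \ge 0\}} c^{T} \left( I - A^{T} (A A^{T})^{-1} A \right) x + c^{T} A^{T} (A A^{T})^{-1} A x \\
&= c^{T} A^{T} (A A^{T})^{-1} b + \max_{\{x \,:\, Ax = b,\; x \ge 0\}} c^{T} \left( I - A^{T} (A A^{T})^{-1} A \right) x.
\end{aligned}
$$

The first term may be ignored because it does not depend on the solution $x$.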
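The key step of the proof is that the two objectives differ by a constant on the set $\{x : Ax = b\}$. A minimal numerical sketch of that identity follows; it assumes $A$ has full row rank so that $A A^{T}$ is invertible, uses random placeholder data, and does not enforce $x \ge 0$ because the identity only relies on $Ax = b$.

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((2, 5))        # hypothetical constraint matrix, full row rank
b = rng.standard_normal(2)
c = rng.standard_normal(5)

AAt_inv = np.linalg.inv(A @ A.T)
P = np.eye(5) - A.T @ AAt_inv @ A      # orthogonal projector onto the null space of A
offset = c @ A.T @ AAt_inv @ b         # constant term c^T A^T (A A^T)^{-1} b

x_particular = A.T @ AAt_inv @ b       # one particular solution of A x = b

gaps = []
for _ in range(5):
    # Every solution of A x = b is x_particular plus a null-space vector.
    x = x_particular + P @ rng.standard_normal(5)
    gaps.append(c @ x - c @ P @ x)     # original objective minus projected objective

print(np.allclose(gaps, offset))       # the gap is the same constant for every x
```

Since the gap does not depend on $x$, maximizing either objective over the same feasible set selects the same points.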

The following corollary shows how the above theorem can be used to strengthen the dimensionality reduction bound. For example, in zero-sum games, this stronger dimensionality reduction splits the bilinear program into two linear programs.

Corollary 8.6. Assume that there are no variables $w$ and $z$ in (8.1). Let:

$$Q_i = I - A_i^{T} (A_i A_i^{T})^{-1} A_i, \quad i \in \{1, 2\},$$

where $A_i$ are defined in (8.1). Let $\tilde{C}$ be:

$$\tilde{C} = Q_1 C Q_2,$$

where $C$ is the bilinear-term matrix from (8.1). Then the bilinear programs will have identical optimal solutions with either $C$ or $\tilde{C}$.

Proof. Using Theorem 8.5, we can modify the original objective function in (8.1) to:

$$f(x, y) = r_1^{T} x + x^{T} \left( I - A_1^{T} (A_1 A_1^{T})^{-1} A_1 \right) C \left( I - A_2^{T} (A_2 A_2^{T})^{-1} A_2 \right) y + r_2^{T} y.$$

For the sake of simplicity we ignore the variables $w$ and $z$, which do not influence the bilinear term. Because both $\left( I - A_i^{T} (A_i A_i^{T})^{-1} A_i \right)$ for $i = 1, 2$ are orthogonal projection matrices, none of the eigenvalues in Theorem 8.4 will increase.
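The claim that the projections do not increase any eigenvalue can be illustrated numerically. The sketch below uses random placeholder matrices and a hypothetical helper `null_space_projector`; it assumes each $A_i$ has full row rank, and it compares the eigenvalues of $C^{T}C$ with those of the projected matrix, computed as squared singular values.

```python
import numpy as np

def null_space_projector(A):
    """Return I - A^T (A A^T)^{-1} A, assuming A has full row rank."""
    n = A.shape[1]
    return np.eye(n) - A.T @ np.linalg.solve(A @ A.T, A)

rng = np.random.default_rng(4)
A1 = rng.standard_normal((2, 6))    # hypothetical constraints on x
A2 = rng.standard_normal((3, 5))    # hypothetical constraints on y
C = rng.standard_normal((6, 5))     # hypothetical bilinear-term matrix

Q1, Q2 = null_space_projector(A1), null_space_projector(A2)
C_projected = Q1 @ C @ Q2           # bilinear-term matrix after both projections

# Eigenvalues of M^T M are the squared singular values of M (descending order).
eig_C = np.linalg.svd(C, compute_uv=False) ** 2
eig_C_projected = np.linalg.svd(C_projected, compute_uv=False) ** 2

print(np.round(eig_C, 3))
print(np.round(eig_C_projected, 3))
```

In this example several eigenvalues of the projected matrix drop to numerically zero, because each projector removes as many directions as its constraint matrix has rows.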

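The bilinear program that minimizes the expected policy loss is:

$$
\begin{aligned}
\min_{\pi,\,\lambda,\,v} \quad & \pi^{T} U \lambda - \alpha^{T} v \\
\text{s.t.} \quad & B \pi = \mathbf{1} \\
& A v - b \ge 0 \\
& \pi \ge 0 \\
& \lambda = A v - b \\
& v \in \mathcal{M}
\end{aligned}
\tag{ABP–U}
$$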

Although this bilinear program represents a minimization, this is immaterial to the dimensionality reduction procedure. The formulation above uses the approximation only conceptually by requiring that $v, \lambda \in \mathcal{M}$; to reduce the dimensionality, the representation must be used explicitly as:

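$$
\begin{aligned}
\min_{\pi,\,\lambda,\,x} \quad & \pi^{T} U (A \Phi x - b) - \alpha^{T} \Phi x \\
\text{s.t.} \quad & B \pi = \mathbf{1} \\
& A \Phi x - b \ge 0 \\
& \pi \ge 0 \\
& \|x\|_2 \le \psi
\end{aligned}
\tag{8.7}
$$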

Here, $v = \Phi x$. The matrix $C = U A \Phi$ has rank at most $m$, where $m$ is the number of features, so $C^{T}C$ has at most $m$ non-zero eigenvalues. Using Theorem 8.4, there exists an identical bilinear program with dimensionality $m$.
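The rank argument can be illustrated with a small sketch. The matrices below are random placeholders with hypothetical sizes, not the actual $U$, $A$, and $\Phi$ of an approximate bilinear program; the point is only that $\pi$ influences the bilinear term through at most $m$ coordinates.

```python
import numpy as np

rng = np.random.default_rng(5)
n_state_actions, n_states, m = 40, 20, 3    # hypothetical problem sizes

U = rng.standard_normal((n_state_actions, n_state_actions))  # placeholder for U
A = rng.standard_normal((n_state_actions, n_states))         # placeholder for A
Phi = rng.standard_normal((n_states, m))                     # placeholder features

C = U @ A @ Phi                              # shape (40, 3), so rank is at most m
rank = np.linalg.matrix_rank(C)

# Left singular vectors of C span the only directions of pi that affect pi^T C x.
left_vectors, _, _ = np.linalg.svd(C, full_matrices=False)   # shape (40, 3)
pi = rng.uniform(size=n_state_actions)
x = rng.standard_normal(m)

pi_reduced = left_vectors.T @ pi             # m coordinates instead of 40
C_reduced = left_vectors.T @ C               # m x m bilinear-term matrix

same_term = np.isclose(pi @ C @ x, pi_reduced @ C_reduced @ x)
print(rank, same_term)
```

The constraints on $\pi$ still have to be expressed in the original coordinates, so this sketch covers only the bilinear term.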

The dimensionality reduction proposed in this chapter requires that the feasible regions are bounded. This is the main reason why the bilinear program in (8.7) includes regularization constraints. While the dimensionality reduction in this chapter does require that the feasible sets have a bounded $L_2$ norm, this is only necessary for excluding dimensions with non-zero eigenvalues. It is also possible to solve the approximate bilinear program with a large number of features when the regularization can be used to eliminate dimensions with non-zero eigenvalues. The algorithm for solving bilinear programs described in this chapter can be practical for ABPs with up to about 50 features.
