

The reader should be aware that in this section we will override many of the notations that were used to describe Lemke’s algorithm. In the future, whenever these notations are used, the algorithm that is being referred to will always be clear from the context.

The Cottle-Dantzig algorithm begins with an arbitrary joint strategy $\sigma_0$, and it produces a sequence of joint strategies $\langle \sigma_0, \sigma_1, \dots, \sigma_k \rangle$. The process of moving from $\sigma_i$ to $\sigma_{i+1}$ is called a major iteration. Let $P_i = \{v \in V : \mathrm{Bal}^{\sigma_i}(v) \ge 0\}$ denote the set of vertices with non-negative balance in $\sigma_i$. The key property of a major iteration is that $P_i \subset P_{i+1}$. This means that at least one vertex with a negative balance in $\sigma_i$ will have a non-negative balance in $\sigma_{i+1}$, and that every vertex with a non-negative balance in $\sigma_i$ still has a non-negative balance in $\sigma_{i+1}$. Since $P_i \subset P_{i+1}$ for every $i$, the algorithm can go through at most $|V|$ major iterations before finding a joint strategy $\sigma_j$ for which $P_j = V$. By Corollary 2.9, we have that $\sigma_j$ is an optimal strategy.

The bulk of this section will be dedicated to describing how a major iteration is carried out. A major iteration begins with a joint strategy $\sigma_i$. The algorithm then picks a vertex $v$ with $\mathrm{Bal}^{\sigma_i}(v) < 0$ to be the distinguished vertex. Throughout this section, we will denote this vertex as $d$. The distinguished vertex will have the property $\mathrm{Bal}^{\sigma_{i+1}}(d) \ge 0$, and we will therefore have $P_i \cup \{d\} \subseteq P_{i+1}$. Once a distinguished vertex has been chosen, the algorithm temporarily modifies the game by adding a bonus to the edge chosen by $\sigma_i$ at $d$. This modification will last only for the duration of the current major iteration.

Definition 4.8 (Modified Game For The Cottle-Dantzig Algorithm). For a rational number $w$, a joint strategy $\sigma$, and a distinguished vertex $d$, we define the game $G_w$ to be the same as $G$ but with a different reward on the edge chosen by $\sigma$ at $d$. If $\sigma$ chooses the left successor at $d$ then the left reward function is defined, for every $u$ in $V$, by:

\[
r^{\lambda}_{w}(u) =
\begin{cases}
r^{\lambda}(u) + w & \text{if } u = d \text{ and } u \in V_{\mathrm{Max}}, \\
r^{\lambda}(u) - w & \text{if } u = d \text{ and } u \in V_{\mathrm{Min}}, \\
r^{\lambda}(u) & \text{otherwise.}
\end{cases}
\]

If $\sigma$ chooses the right successor at $d$ then $r^{\rho}$ is modified in a similar manner.

The reward on the edge chosen by $\sigma$ at $v$ in the game $G_w$ is denoted as $r^{\sigma}_{w}(v)$. For a joint strategy $\sigma$ and a vertex $v$, the value and balance of $v$ for the strategy $\sigma$ in the game $G_w$ are denoted as $\mathrm{Value}^{\sigma}_{w}(v)$ and $\mathrm{Bal}^{\sigma}_{w}(v)$, respectively.
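The modification in Definition 4.8 can be sketched in a few lines of Python. The dictionary encoding of the game and the name `modified_rewards` are hypothetical choices made for this illustration, not part of the original presentation.

```python
def modified_rewards(rewards, owner, sigma, d, w):
    """Return a copy of `rewards` with the bonus w applied at vertex d.

    Assumed (hypothetical) encoding:
      rewards[side][v] -- reward of the "left" or "right" edge leaving v
      owner[v]         -- "max" or "min"
      sigma[v]         -- side ("left" or "right") chosen by the joint strategy at v
    """
    new = {side: dict(table) for side, table in rewards.items()}
    side = sigma[d]  # only the edge chosen by sigma at d is changed
    if owner[d] == "max":
        new[side][d] = rewards[side][d] + w  # bonus added for a Max vertex
    else:
        new[side][d] = rewards[side][d] - w  # bonus subtracted for a Min vertex
    return new


rewards = {"left": {0: 1, 1: 5}, "right": {0: 2, 1: 7}}
owner = {0: "max", 1: "min"}
sigma = {0: "left", 1: "right"}
game_w = modified_rewards(rewards, owner, sigma, d=0, w=3)
```

Here `game_w["left"][0]` is 4 (the original reward 1 plus the bonus 3), every other reward is unchanged, and the input `rewards` is left untouched because the tables are copied first.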

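To see why the game is modified in this way, it is useful to look at the balance of $d$ in the modified game. The properties that we describe will hold no matter who owns $d$, but for the purposes of demonstration we assume that $d \in V_{\mathrm{Max}}$. Writing $\bar\sigma_i(d)$ for the successor of $d$ that $\sigma_i$ does not choose, the balance of $d$ for $\sigma_i$ in $G_w$ is then:

\[
\mathrm{Bal}^{\sigma_i}_{w}(d) = \mathrm{Value}^{\sigma_i}_{w}(d) - \bigl(r^{\bar\sigma_i}_{w}(d) + \beta \cdot \mathrm{Value}^{\sigma_i}_{w}(\bar\sigma_i(d))\bigr).
\]

Since $r^{\sigma_i}_{w}(d) = r^{\sigma_i}(d) + w$, we can see that $\mathrm{Value}^{\sigma_i}_{w}(d)$ must increase as $w$ is increased. It may also be the case that $\mathrm{Value}^{\sigma_i}_{w}(\bar\sigma_i(d))$ increases as $w$ is increased; however, due to discounting, it must increase at a slower rate than $\mathrm{Value}^{\sigma_i}_{w}(d)$. Therefore, as $w$ is increased, $\mathrm{Bal}^{\sigma_i}_{w}(d)$ will also increase.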
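As a rough numerical illustration of this argument, the following Python sketch evaluates the balance of $d$ for several bonuses on a two-vertex game. The game itself, the dictionary encoding, the helper names `values` and `balance_of_d`, and the reading of the balance of a Max vertex as its value minus the value of deviating to the edge it does not choose are all assumptions made for the example.

```python
BETA = 0.5  # discount factor (assumed for the example)

# Each vertex has a "left" and a "right" edge: (successor, reward).
EDGES = {
    0: {"left": (0, 0.0), "right": (1, 1.0)},
    1: {"left": (1, 2.0), "right": (0, 0.0)},
}
OTHER = {"left": "right", "right": "left"}


def values(sigma, d, w, rounds=200):
    """Value of every vertex when sigma is played and the edge that sigma
    picks at d carries the bonus w (d is assumed to belong to Max)."""
    val = {v: 0.0 for v in EDGES}
    for _ in range(rounds):  # fixed-point iteration; error shrinks like BETA**rounds
        new = {}
        for v, side in sigma.items():
            succ, reward = EDGES[v][side]
            bonus = w if v == d else 0.0
            new[v] = reward + bonus + BETA * val[succ]
        val = new
    return val


def balance_of_d(sigma, d, w):
    """Balance of a Max vertex d: its value minus the value of deviating
    to the edge that sigma does not choose (which carries no bonus)."""
    val = values(sigma, d, w)
    succ, reward = EDGES[d][OTHER[sigma[d]]]
    return val[d] - (reward + BETA * val[succ])


sigma = {0: "left", 1: "right"}
balances = [balance_of_d(sigma, 0, w) for w in (0.0, 1.0, 2.0)]
```

In this game the value of vertex 0 is $2w$ and the value of vertex 1 is $w$, so the balance of $d = 0$ is $1.5w - 1$: the list `balances` is approximately `[-1.0, 0.5, 2.0]`. The value of $d$ grows at rate 2 while the discounted value of its other successor grows only at rate 0.5, which is the effect described above.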

The algorithm will use machinery that is similar to the methods employed by Lemke’s algorithm. In each major iteration the algorithm will produce a sequence of pairs $\langle (\sigma_i = \sigma_{i,0}, w_0), (\sigma_{i,1}, w_1), \dots, (\sigma_{i,k}, w_k) \rangle$, with $w_0 < w_1 < \dots < w_k$, which satisfies the following properties.

1. For every vertex $v \in P_i$ we have $\mathrm{Bal}^{\sigma_{i,j}}_{w_j}(v) \ge 0$.

2. For every value $y > w_j$ there is some vertex $v \in P_i$ with $\mathrm{Bal}^{\sigma_{i,j}}_{y}(v) < 0$.

3. There is some vertex $v \in P_i$ with $\mathrm{Bal}^{\sigma_{i,j}}_{w_j}(v) = 0$.

Much like in Lemke’s algorithm, the first property ensures correctness, by never allowing a vertex in $P_i$ to have a negative balance, and the second property ensures termination, by preventing the algorithm from considering the same joint strategy twice. The third property ensures that there is always some vertex $v \in P_i$ in $\sigma_{i,j}$ that can be switched to produce $\sigma_{i,j+1}$.

Each step of a major iteration begins with a joint strategy $\sigma_{i,j}$ and a value $w_{j-1}$, for which $\sigma_{i,j}$ satisfies the first property in $G_{w_{j-1}}$. For $\sigma_{i,0}$, we can use $w_{-1} = 0$ to obtain this property. Much like in Lemke’s algorithm, we want to compute the value $w_j = w_{j-1} + c$ that satisfies all of the properties. We therefore need to know the rate at which the balance of a vertex changes as we increase $c$, which in turn depends on the rate at which values change. For each joint strategy $\sigma$, we denote the latter as $\partial_w \mathrm{Value}^{\sigma}_{w}(u)$, with the understanding that:

\[
\mathrm{Value}^{\sigma}_{w+c}(v) - \mathrm{Value}^{\sigma}_{w}(v) = \partial_w \mathrm{Value}^{\sigma}_{w}(v) \cdot c.
\]

The following proposition is a trivial consequence of the characterisation of $\mathrm{Value}^{\sigma}$ given by (4.2).

Proposition 4.9. Consider a vertex $u$ and a joint strategy $\sigma$. Suppose that $v$ is the distinguished vertex. The rate of change $\partial_w \mathrm{Value}^{\sigma}_{w}(u)$ is $D^{u}_{\sigma}(v)$.

Once again, we define the rate of change of the balance of a vertex $v$ in a joint strategy $\sigma$ to be $\partial_w \mathrm{Bal}^{\sigma}_{w}(v)$, with the understanding that:

\[
\mathrm{Bal}^{\sigma}_{w+c}(v) - \mathrm{Bal}^{\sigma}_{w}(v) = \partial_w \mathrm{Bal}^{\sigma}_{w}(v) \cdot c.
\]

We can obtain an expression for $\partial_w \mathrm{Bal}^{\sigma}_{w}(v)$ by substituting the result of Proposition 4.9 into the definition of balance given in (2.12).

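Proposition 4.10. Consider a vertex $u$ and a joint strategy $\sigma$, and suppose that $\sigma$ chooses the edge with the bonus at $d$. The rate of change $\partial_w \mathrm{Bal}^{\sigma}_{w}(u)$ is:

\[
\partial_w \mathrm{Bal}^{\sigma}_{w}(u) =
\begin{cases}
\partial_w \mathrm{Value}^{\sigma}_{w}(u) - \beta \cdot \partial_w \mathrm{Value}^{\sigma}_{w}(\bar\sigma(u)) & \text{if } u \in V_{\mathrm{Max}}, \\
\beta \cdot \partial_w \mathrm{Value}^{\sigma}_{w}(\bar\sigma(u)) - \partial_w \mathrm{Value}^{\sigma}_{w}(u) & \text{if } u \in V_{\mathrm{Min}}.
\end{cases}
\]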

Proof. The proof is very similar to the proof of Proposition 4.5. The proof of this proposition is simpler, because the bonus $w$ is guaranteed to be on an edge chosen by $\sigma$. Therefore, the reward on the edge that $\sigma$ does not choose is unchanged, that is, $r^{\bar\sigma}_{w}(v) = r^{\bar\sigma}(v)$ for every vertex $v$, and so the careful consideration of $r^{\bar\sigma}_{w}(v)$ in Proposition 4.5 does not need to be repeated.
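For a vertex $u \in V_{\mathrm{Max}}$, the substitution can be written out directly. The sketch below assumes that the balance in (2.12) has the form "value of $u$ minus the value of deviating to the successor $\bar\sigma(u)$ that $\sigma$ does not choose"; this is an assumption about a definition given outside this section.

```latex
\begin{align*}
\mathrm{Bal}^{\sigma}_{w+c}(u) - \mathrm{Bal}^{\sigma}_{w}(u)
 &= \bigl(\mathrm{Value}^{\sigma}_{w+c}(u) - \mathrm{Value}^{\sigma}_{w}(u)\bigr)
  - \bigl(r^{\bar\sigma}_{w+c}(u) - r^{\bar\sigma}_{w}(u)\bigr) \\
 &\quad - \beta \bigl(\mathrm{Value}^{\sigma}_{w+c}(\bar\sigma(u)) - \mathrm{Value}^{\sigma}_{w}(\bar\sigma(u))\bigr) \\
 &= \partial_w \mathrm{Value}^{\sigma}_{w}(u) \cdot c \; - \; 0 \; - \;
    \beta \cdot \partial_w \mathrm{Value}^{\sigma}_{w}(\bar\sigma(u)) \cdot c .
\end{align*}
```

The middle term vanishes because the bonus sits on an edge chosen by $\sigma$, so the reward of the other edge does not depend on $w$. Dividing by $c$ gives the $V_{\mathrm{Max}}$ case, and the $V_{\mathrm{Min}}$ case follows with the signs reversed.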

With Propositions 4.9 and 4.10 in hand, the minimum ratio test from Lemke’s algorithm can be reused with very little modification. The proof of the following proposition is identical to the proof given for Proposition 4.6.

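Proposition 4.11. Let $\sigma_{i,j}$ be a joint strategy, and let $w_{j-1}$ be a rational value. Suppose that for every vertex $v \in P_i$ we have $\mathrm{Bal}^{\sigma_{i,j}}_{w_{j-1}}(v) \ge 0$. If we set:

\[
w_j = w_{j-1} + \min\left\{ \frac{\mathrm{Bal}^{\sigma_{i,j}}_{w_{j-1}}(v)}{-\partial_w \mathrm{Bal}^{\sigma_{i,j}}_{w_{j-1}}(v)} \; : \; v \in P_i \text{ and } \partial_w \mathrm{Bal}^{\sigma_{i,j}}_{w_{j-1}}(v) < 0 \right\},
\]

then all of the following properties hold.

1. For every vertex $v \in P_i$ we have $\mathrm{Bal}^{\sigma_{i,j}}_{w_j}(v) \ge 0$.

2. For every value $y > w_j$ there is some vertex $v \in P_i$ with $\mathrm{Bal}^{\sigma_{i,j}}_{y}(v) < 0$.

3. There is some vertex $v \in P_i$ with $\mathrm{Bal}^{\sigma_{i,j}}_{w_j}(v) = 0$.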
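The minimum ratio test of Proposition 4.11 can be sketched as follows. The function name `min_ratio_step`, the dictionary inputs, and the example numbers are hypothetical. The text does not say what happens when no balance in $P_i$ is decreasing, so returning `None` in that case is a choice made for this sketch only.

```python
from fractions import Fraction


def min_ratio_step(w_prev, bal, dbal, P):
    """One minimum ratio test.

    bal[v]  -- balance of v in the game with bonus w_prev (assumed >= 0 on P)
    dbal[v] -- rate of change of that balance as the bonus grows
    Returns (w_next, tight), where `tight` lists the vertices of P whose
    balance reaches zero at w_next.
    """
    candidates = [(bal[v] / -dbal[v], v) for v in P if dbal[v] < 0]
    if not candidates:
        return None, []  # assumption of this sketch: no decreasing balance in P
    step = min(ratio for ratio, _ in candidates)
    tight = [v for ratio, v in candidates if ratio == step]
    return w_prev + step, tight


P = ["a", "b", "c"]
bal = {"a": Fraction(3), "b": Fraction(1), "c": Fraction(2)}
dbal = {"a": Fraction(-1), "b": Fraction(-1, 2), "c": Fraction(1)}
w_next, tight = min_ratio_step(Fraction(0), bal, dbal, P)
```

The ratios are 3 for `a` and 2 for `b`, while `c` is skipped because its balance is increasing, so `w_next` is 2 and `tight` is `["b"]`. At that bonus the balances are 1, 0 and 4, which matches properties 1 and 3, and any larger bonus makes the balance of `b` negative, which matches property 2. Exact rationals are used so that the equality test for tight vertices is reliable.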

As with Lemke’s algorithm, it is possible that the algorithm could make a degenerate step, where $w_{j+1} = w_j$. These cases will be discussed in Section 4.1.3, and for now we will assume that $w_{j+1} > w_j$.

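Once $w_j$ has been computed, the algorithm then switches a vertex that is indifferent when $\sigma_{i,j}$ is played on $G_{w_j}$. This produces the joint strategy $\sigma_{i,j+1}$ that will be considered in the next step. The algorithm repeats these steps until it finds a pair $(\sigma_{i,k}, w_k)$ for which $\mathrm{Bal}^{\sigma_{i,k}}_{w_k}(d) \ge 0$. We define $\sigma_{i+1}$ to be $\sigma_{i,k}$ with the vertex $d$ switched to the edge $\bar\sigma_i(d)$. We now argue that this correctly implements a major iteration.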

Proposition 4.12. For every vertex $v \in P_i \cup \{d\}$ we have $\mathrm{Bal}^{\sigma_{i+1}}(v) \ge 0$.

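Proof. Since $\mathrm{Bal}^{\sigma_{i,k}}_{w_{k-1}}(d) < 0$ and $\mathrm{Bal}^{\sigma_{i,k}}_{w_k}(d) \ge 0$, we know that there is some value $w_{k-1} < y \le w_k$ such that $\mathrm{Bal}^{\sigma_{i,k}}_{y}(d) = 0$. Consider the joint strategy $\sigma_{i,k}$ played in the game $G_y$. Since $d$ is indifferent, switching it does not change the value of any vertex, and therefore $\mathrm{Bal}^{\sigma_{i+1}}_{y}(v) \ge 0$ for every vertex $v \in P_i \cup \{d\}$. We must now argue that $\mathrm{Bal}^{\sigma_{i+1}}(v) \ge 0$ for every vertex $v \in P_i \cup \{d\}$.

For every vertex $v \in P_i$, this holds because $\sigma_{i+1}$ does not use the edge to which the bonus $y$ has been attached. Therefore, the characterisation given by (4.2) implies that $\mathrm{Value}^{\sigma_{i+1}}_{y}(u) = \mathrm{Value}^{\sigma_{i+1}}(u)$ for every vertex $u$. This implies that $\mathrm{Bal}^{\sigma_{i+1}}_{y}(u) = \mathrm{Bal}^{\sigma_{i+1}}(u)$ for every vertex $u \ne d$.

The above reasoning does not hold for the vertex $d$, because $d$ is the only vertex at which $r^{\sigma_i}(d) \ne r^{\sigma_i}_{y}(d)$. However, if $d \in V_{\mathrm{Max}}$ then, since $\sigma_i(d)$ is the successor that $\sigma_{i+1}$ does not choose at $d$:

\begin{align*}
\mathrm{Bal}^{\sigma_{i+1}}_{y}(d)
 &= \mathrm{Value}^{\sigma_{i+1}}_{y}(d) - \bigl(r^{\sigma_i}_{y}(d) + \beta \cdot \mathrm{Value}^{\sigma_{i+1}}_{y}(\sigma_i(d))\bigr) \\
 &= \mathrm{Value}^{\sigma_{i+1}}(d) - \bigl(r^{\sigma_i}(d) + y + \beta \cdot \mathrm{Value}^{\sigma_{i+1}}(\sigma_i(d))\bigr) \\
 &\le \mathrm{Value}^{\sigma_{i+1}}(d) - \bigl(r^{\sigma_i}(d) + \beta \cdot \mathrm{Value}^{\sigma_{i+1}}(\sigma_i(d))\bigr) \\
 &= \mathrm{Bal}^{\sigma_{i+1}}(d).
\end{align*}

Therefore, $\mathrm{Bal}^{\sigma_{i+1}}(d)$ must be non-negative. The proof for the case where $d \in V_{\mathrm{Min}}$ is symmetric.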

The Cottle-Dantzig algorithm for discounted games is shown as Algorithm 3. Note that in major iteration $i$, the algorithm only ever switches vertices in $P_i$. Therefore, the algorithm can consider at most $2^{|P_i|}$ joint strategies in major iteration $i$.

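Algorithm 3 Cottle-Dantzig($G$, $\sigma$)
    $i := 0$; $\sigma_0 := \sigma$; $P_0 := \{v \in V : \mathrm{Bal}^{\sigma_0}(v) \ge 0\}$
    while $P_i \ne V$ do
        $j := 0$; $\sigma_{i,0} := \sigma_i$; $w_{-1} := 0$
        $P_i := \{v \in V : \mathrm{Bal}^{\sigma_i}(v) \ge 0\}$
        $d :=$ some vertex in $V \setminus P_i$
        while $\mathrm{Bal}^{\sigma_{i,j}}_{w_{j-1}}(d) < 0$ do
            $w_j := w_{j-1} + \min\{ -\mathrm{Bal}^{\sigma_{i,j}}_{w_{j-1}}(u) \,/\, \partial_w \mathrm{Bal}^{\sigma_{i,j}}_{w_{j-1}}(u) : u \in P_i \text{ and } \partial_w \mathrm{Bal}^{\sigma_{i,j}}_{w_{j-1}}(u) < 0 \}$
            $\sigma_{i,j+1} := \sigma_{i,j}[\bar\sigma_{i,j}(v)/v]$ for some vertex $v \in P_i$ with $\mathrm{Bal}^{\sigma_{i,j}}_{w_j}(v) = 0$
            $j := j + 1$
        end while
        $\sigma_{i+1} := \sigma_{i,j}[\bar\sigma_{i,j}(d)/d]$; $i := i + 1$
    end while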

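The largest number of joint strategies that the algorithm can consider over all major iterations is $\sum_{i=0}^{|V|-1} 2^{i} = 2^{|V|} - 1$. Therefore, we have shown the following theorem.

Theorem 4.13. Algorithm 3 terminates, with the optimal joint strategy, after at most $2^{|V|} - 1$ iterations.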
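The count behind this bound is a geometric sum. As a small worked check, writing $n = |V|$ (the choice $n = 3$ is only an illustration):

```latex
\[
\sum_{i=0}^{n-1} 2^{i} = 2^{n} - 1,
\qquad \text{for example, with } n = 3:\quad 2^{0} + 2^{1} + 2^{2} = 1 + 2 + 4 = 7 = 2^{3} - 1 .
\]
```

The identity follows by writing $S$ for the sum and noting that $2S - S = 2^{n} - 2^{0}$.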
