
The two prize problem is an instance of the CPCP with exactly two prizes $\{i_1, i_2\}$.¹ We assume that $\lambda = \infty$ and consequently we have a two-person constant-sum game. The case where $\lambda < \infty$ leads to a two-person non-cooperative general-sum game and is discussed in Section 5.4.

Without loss of generality, suppose the players are in the same half plane relative to the prizes, that $d_{Ai_1} \le d_{Bi_1}$, and that if $d_{Ai_1} = d_{Bi_1}$ then $d_{Ai_2} \le d_{Bi_2}$. We also suppose that $|d_{Ai_1} - d_{Ai_2}| \ne d_{i_1 i_2}$, i.e., player A is neither collinear with nor between the prizes.²

We use the notation "A-o i" to indicate that player A takes one step, of fixed size, towards prize i, and "A-o X" to indicate that player A moves to a location X such that $d_{AX}$ does not exceed the step size. The notation "A [> $i_1$" stands for the hypothesis that player A is committed to travel all the way to prize $i_1$.
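The single-step move "A-o i" can be illustrated with a minimal sketch. The function name `step_towards`, the tuple representation of locations, and the parameter `step` (the step size) are assumptions made for illustration; the text does not prescribe an implementation.

```python
import math

def step_towards(pos, target, step):
    """Move from pos towards target by at most `step`, stopping at the target.

    Hypothetical helper: locations are (x, y) tuples and `step` is the step size.
    """
    dx, dy = target[0] - pos[0], target[1] - pos[1]
    remaining = math.hypot(dx, dy)
    if remaining <= step:
        # The prize is within one step, so the player arrives at it.
        return target
    scale = step / remaining
    return (pos[0] + scale * dx, pos[1] + scale * dy)
```

Repeating this call with the same target until the player arrives corresponds to the commitment "A [> i".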

A move A-o X is critical if otherwise there is no possibility that player A can successfully achieve its current guaranteed value. Hence the current decision is a "knife-edge" decision.

5.1.1 Static Cases

The following cases (i)-(iv) are the static cases for which there exists a saddle point equilibrium in which neither player requires a contingent strategy. We say that the static cases are guarantee determined.

Case (i): $d_{Ai_1} < d_{Bi_1}$ and $d_{Ai_2} > d_{Bi_2}$. Figure 5.1 shows the possible locations of player A satisfying case (i) given the locations of player B and the two prizes. Player A can guarantee value $v_{i_1}$ since prize $i_1$ is guaranteed to player A, and player B can guarantee value $v_{i_2}$ since prize $i_2$ is guaranteed to player B. Hence A-o $i_1$ and B-o $i_2$ is a saddle point equilibrium.

¹ We use $\{i_1, i_2\}$ rather than $\{1, 2\}$ as the prize labels since later we apply the two prize problem relative to a larger prize set.

Case (ii): $d_{Ai_1} = d_{Bi_1}$ and $d_{Ai_2} = d_{Bi_2}$. Without loss of generality we may assume that $v_{i_1} \ge v_{i_2}$.

Suppose $v_{i_1} > v_{i_2}$. If A [> $i_1$ then the resulting outcome is at least $\frac{1}{2}(v_{i_1} + v_{i_2})$ to player A. However, A-o $i_1$ is critical for player A since otherwise if B-o $i_1$ then the resulting outcome is $v_{i_2} < \frac{1}{2}(v_{i_1} + v_{i_2})$. Similarly B-o $i_1$ is critical for player B. Hence A-o $i_1$ and B-o $i_1$ is a saddle point equilibrium in which the players share the sequence $i_1 \to i_2$, for a reward of $\frac{1}{2}(v_{i_1} + v_{i_2})$ to each player.

Suppose $v_{i_1} = v_{i_2}$. Any pair of moves which target a prize is a saddle point equilibrium in which the players either share the sequence $i_1 \to i_2$ or claim one prize each, for a reward of $v_{i_1} = v_{i_2} = \frac{1}{2}(v_{i_1} + v_{i_2})$ to each player.

In both cases a saddle point equilibrium is A-o $\arg\max\{v_{i_1}, v_{i_2}\}$ and B-o $\arg\max\{v_{i_1}, v_{i_2}\}$, for a reward of $\frac{1}{2}(v_{i_1} + v_{i_2})$ to each player.
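The saddle point claim in case (ii) can be checked on a simplified 2x2 game. This is a sketch under stated assumptions: each player is reduced to choosing a first target, players who choose the same prize share both prizes, and players who choose different prizes each claim their own target. The names `case_ii_matrix` and `saddle_values` are hypothetical.

```python
def case_ii_matrix(v1, v2):
    """Payoff to player A; rows are A's target (i1, i2), columns are B's target."""
    share = (v1 + v2) / 2
    return [[share, v1],   # A targets i1: shared if B also targets i1, else A claims i1
            [v2, share]]   # A targets i2: A claims i2 if B targets i1, else shared

def saddle_values(payoff):
    """Return (maximin, minimax) for the row player; equal values indicate a saddle point."""
    maximin = max(min(row) for row in payoff)
    minimax = min(max(col) for col in zip(*payoff))
    return maximin, minimax
```

With `v1 = 6` and `v2 = 4`, both values equal 5, attained when both players target $i_1$, which agrees with the reward of half the total stated above.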

Case (iii): $d_{Ai_1} < d_{Bi_1}$ and $d_{Ai_2} = d_{Bi_2}$. Suppose $v_{i_1} \ge v_{i_2}$. If A [> $i_1$ then the resulting outcome is at least $v_{i_1}$ to player A. If B [> $i_2$ then the resulting outcome is at least $v_{i_2}$ to player B. Hence A-o $i_1$ and B-o $i_2$ is a saddle point equilibrium in which player A eventually claims prize $i_1$ for reward $v_{i_1}$ and player B eventually claims prize $i_2$ for reward $v_{i_2}$.

Suppose $v_{i_1} < v_{i_2}$. If A [> $i_2$ then the resulting outcome is at least $\frac{1}{2}(v_{i_1} + v_{i_2})$ to player A. However, A-o $i_2$ is critical for player A since otherwise if B-o $i_2$ then the resulting outcome is $v_{i_1} < \frac{1}{2}(v_{i_1} + v_{i_2})$. Similarly B-o $i_2$ is critical for player B. Hence A-o $i_2$ and B-o $i_2$ is a saddle point equilibrium in which the players share the sequence $i_2 \to i_1$, for a reward of $\frac{1}{2}(v_{i_1} + v_{i_2})$ to each player.

In both cases a saddle point equilibrium is A-o $\arg\max\{v_{i_1}, v_{i_2}\}$ and B-o $i_2$, for a reward of $\max\{\frac{1}{2}(v_{i_1} + v_{i_2}), v_{i_1}\}$ to player A and $\min\{\frac{1}{2}(v_{i_1} + v_{i_2}), v_{i_2}\}$ to player B.
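The closed-form rewards for case (iii) can be evaluated directly. The following is a small sketch; the function name `case_iii_rewards` is hypothetical and the prize values are passed as plain numbers.

```python
def case_iii_rewards(v1, v2):
    """Equilibrium rewards (to A, to B) in case (iii) for prize values v1 and v2."""
    share = (v1 + v2) / 2
    reward_a = max(share, v1)  # A keeps i1 outright unless sharing both prizes pays more
    reward_b = min(share, v2)  # B's reward is the remainder of the constant sum
    return reward_a, reward_b
```

In either ordering of the prize values the two rewards sum to the total prize value, as expected in a constant-sum game.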

Case (iv): $d_{Ai_1} + d_{i_1 i_2} < d_{Bi_2}$ or $d_{Ai_2} + d_{i_1 i_2} < d_{Bi_1}$. Figure 5.1 also shows the possible locations of player A satisfying case (iv) given the locations of player B and the two prizes. Player A can guarantee value $v_{i_1} + v_{i_2}$ since either A [> ($i_1 \to i_2$) or A [> ($i_2 \to i_1$) is guaranteed.

5.1.2 The Two Prize Theorem

The remaining, non-static cases are characterised by constraints (5.1)-(5.4) and are called the dynamic cases, since there is no optimal strategy for either player that does not involve playing contingently upon the observed movements of the opponent.

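\begin{align}
d_{Ai_1} &< d_{Bi_1} \tag{5.1}\\
d_{Ai_2} &< d_{Bi_2} \tag{5.2}\\
d_{Ai_1} + d_{i_1 i_2} &\ge d_{Bi_2} \tag{5.3}\\
d_{Ai_2} + d_{i_1 i_2} &\ge d_{Bi_1} \tag{5.4}
\end{align}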

Constraints (5.1)-(5.2) are known as the prize constraints and constraints (5.3)-(5.4) as the pair constraints. Player A is known as the "A-head" player and player B is known as the "B-hind" player. Figure 5.2 shows the possible locations of player A satisfying constraints (5.1)-(5.4) given the locations of player B and the two prizes.
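The case analysis can be summarised as a classifier over player and prize coordinates. This is an illustrative sketch only: the function names, the tuple representation of points, and the tolerance `eps` used for equality tests are assumptions, and the pair constraints are treated as the complement of case (iv).

```python
import math

def dist(p, q):
    """Euclidean distance between two points in the plane."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

def classify(a, b, p1, p2, eps=1e-9):
    """Label a configuration as one of the static cases (i)-(iv) or as dynamic.

    a, b are the player locations; p1, p2 are the locations of prizes i1, i2.
    """
    d_a1, d_a2 = dist(a, p1), dist(a, p2)
    d_b1, d_b2 = dist(b, p1), dist(b, p2)
    d_12 = dist(p1, p2)

    def lt(x, y):
        return x < y - eps          # strictly less, up to tolerance

    def eq(x, y):
        return abs(x - y) <= eps    # equal, up to tolerance

    if lt(d_a1 + d_12, d_b2) or lt(d_a2 + d_12, d_b1):
        return "static (iv)"        # A can collect both prizes
    if lt(d_a1, d_b1) and lt(d_b2, d_a2):
        return "static (i)"         # one prize guaranteed to each player
    if eq(d_a1, d_b1) and eq(d_a2, d_b2):
        return "static (ii)"        # tied on both prizes
    if lt(d_a1, d_b1) and eq(d_a2, d_b2):
        return "static (iii)"       # A ahead on i1, tied on i2
    if lt(d_a1, d_b1) and lt(d_a2, d_b2):
        return "dynamic"            # prize constraints hold, case (iv) excluded
    return "outside the labelling convention"
```

The final branch is reached only when the players or prizes are labelled contrary to the without-loss-of-generality convention, in which case the labels would be swapped before classifying.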

94 Optimal and Heuristic Analysis of Tiny Problems


Figure 5.2: Dynamic Case of the Two Prize Problem

Theorem 5.1.1 (Two Prize Theorem)

Suppose the constraints (5.3)-(5.4) hold at some time $t_0$ with player A located at A and player B located at B. Player A declares that it will arrive at location X at time $t_X \ge t_0 + d_{AX}$. Then $\exists$ a location Y such that

\begin{align}
d_{AX} &\ge d_{BY} \tag{5.5}\\
d_{Xi_1} + d_{i_1 i_2} &\ge d_{Yi_2} \tag{5.6}\\
d_{Xi_2} + d_{i_1 i_2} &\ge d_{Yi_1} \tag{5.7}
\end{align}

and, hence, if player B moves directly to location Y, then the pair constraints also hold at time $t_X$. In particular, if X is the location of prize $i_1$, then $\exists$ a corresponding Y located on the line segment between B and prize $i_2$, and if X is the location of prize $i_2$, then $\exists$ a corresponding Y located on the line segment between B and prize $i_1$.
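The special case in which X is a prize location can be checked numerically. The sketch below assumes one particular choice of Y, namely the point reached when player B travels a distance $d_{AX}$ (or less, if the prize is nearer) along the segment towards the other prize. The theorem only asserts that some Y on that segment exists, so this choice, the helper names, and the tolerance `eps` are illustrative assumptions. Constraints (5.5)-(5.7) are tested as non-strict inequalities up to the tolerance.

```python
import math

def dist(p, q):
    """Euclidean distance between two points in the plane."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

def move_towards(p, q, length):
    """Point reached by travelling `length` from p towards q, stopping at q."""
    d = dist(p, q)
    if d <= length:
        return q
    t = length / d
    return (p[0] + t * (q[0] - p[0]), p[1] + t * (q[1] - p[1]))

def respond_to_prize_move(a, b, target, other):
    """A declares X = target (a prize); B answers by heading for the other prize."""
    x = target
    y = move_towards(b, other, dist(a, x))
    return x, y

def constraints_hold(a, b, x, y, p1, p2, eps=1e-9):
    """Check (5.5)-(5.7) for the pair of locations X and Y."""
    d_12 = dist(p1, p2)
    return (dist(b, y) <= dist(a, x) + eps               # (5.5)
            and dist(y, p2) <= dist(x, p1) + d_12 + eps  # (5.6)
            and dist(y, p1) <= dist(x, p2) + d_12 + eps) # (5.7)
```

For example, with prizes at (0, 0) and (4, 0), player A at (2, 3) and player B at (2, 5), the response to A heading for either prize satisfies all three constraints.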

Proof:

If $d_{Bi_1} \le d_{Xi_2} + d_{i_1 i_2}$ or $d_{Bi_2} \le d_{AX}$ then move player B a distance $\min\{d_{AX}, d_{Bi_2}\}$ directly towards prize $i_2$. Alternatively, if $d_{Bi_2} \le d_{Xi_1} + d_{i_1 i_2}$ or $d_{Bi_1} \le d_{AX}$ then move player B a distance $\min\{d_{AX}, d_{Bi_1}\}$ directly towards prize $i_1$. Henceforth, suppose $d_{Bi_1} > d_{AX}$, $d_{Bi_2} > d_{AX}$, $d_{Bi_1} > d_{Xi_2} + d_{i_1 i_2}$ and $d_{Bi_2} > d_{Xi_1} + d_{i_1 i_2}$.

From the perspective of location A, let $\psi_{Ai_1}$ be the bearing to prize $i_1$ and let $\psi_{Ai_2}$ be the bearing to prize $i_2$. From the perspective of location B, let $\psi_{Bi_1}$ be the bearing to prize $i_1$ and let $\psi_{Bi_2}$ be the bearing to prize $i_2$. As in Figure 5.3, let $\theta_A$ be the acute angle between $\psi_{Ai_1}$ and $\psi_{Ai_2}$ and let $\theta_B$ be the acute angle between $\psi_{Bi_1}$ and $\psi_{Bi_2}$.

Without loss of generality, we may assume that the location X is on or inside the triangle $\triangle A i_1 i_2$, since this can only make constraints (5.6)-(5.7) more restrictive. From the perspective of location X, let $\psi_{Xi_1}$ be the bearing to prize $i_1$. Let $\theta_{Xi_1}$ be the acute angle between $\psi_{Xi_1}$ and [...]

Figure 5.3: Angles $\theta_A$ and $\theta_B$.

We now construct the required location Y in each of the following cases.

(I). Suppose $\theta_B \le \theta_A$. Let $\psi_Y$ be the bearing which is an angle $\theta_{Xi_1}$ from $\psi_{Bi_2}$ towards $\psi_{Bi_1}$ via the acute angle between $\psi_{Bi_2}$ and $\psi_{Bi_1}$.

• If the distance along bearing $\psi_Y$ from B to the line through the prizes is no greater than $d_{AX}$, then let Y be the intersection of bearing $\psi_Y$ with this line through the prizes. Then Y satisfies constraints (5.5)-(5.7).

• Otherwise, move player B to a location Y along bearing $\psi_Y$ a distance $d_{AX}$ (but no further than the line between the prizes). Player B travels at least as close in deviation of bearing from the bearing of each prize for the same distance as does player A from a location farther from both prizes. Hence Y satisfies constraints (5.5)-(5.7).

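(II). Consider the location H such that $d_{Hi_1} = d_{Ai_2} + d_{i_1 i_2}$ and $d_{Hi_2} = d_{Ai_1} + d_{i_1 i_2}$. Applying the Cosine Law to $\triangle A i_1 i_2$ and $\triangle H i_1 i_2$:

\begin{align}
d_{i_1 i_2}^2 &= d_{Ai_1}^2 + d_{Ai_2}^2 - 2 d_{Ai_1} d_{Ai_2} \cos\theta_A \tag{5.8}\\
d_{i_1 i_2}^2 &= d_{Hi_1}^2 + d_{Hi_2}^2 - 2 d_{Hi_1} d_{Hi_2} \cos\theta_H \notag\\
&= (d_{Ai_2} + d_{i_1 i_2})^2 + (d_{Ai_1} + d_{i_1 i_2})^2 - 2 (d_{Ai_2} + d_{i_1 i_2})(d_{Ai_1} + d_{i_1 i_2}) \cos\theta_H \notag\\
&= d_{Ai_1}^2 + d_{Ai_2}^2 - 2 d_{Ai_1} d_{Ai_2} \cos\theta_H + \cdots \notag
\end{align}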
