
The Prisoner’s Dilemma game is familiar to most people, even if they are not explicitly aware of its name, details or predictions. The game was first devised by Flood and Dresher whilst they were working for the RAND (Research and Development) Corporation and was formalised in 1950 by Tucker, who gave it the Prisoner’s Dilemma name and the prison sentence pay-offs [151]. The general rules of the game are simple: two players, who are not allowed contact with each other prior to playing, each choose between two actions: cooperate or defect. In the formalised prison sentence version of the game, the players are suspected of a crime that they committed together. The players are questioned separately and each is asked whether the other prisoner was involved in the crime. Staying quiet and confessing nothing is termed a “cooperation” (as the prisoner cooperates with his accomplice), whilst confessing that the other prisoner was involved in the crime is termed a “defection” (as the prisoner defects against his accomplice).

Play occurs simultaneously over one round, so neither player has a chance to react and reward or punish their opponent. There are four possible outcomes to the game, as shown in table 2.1, and the ordering of these outcomes creates the dilemma that gives the game its name. What is important about the game is that the joint action of the players determines the final outcome for both. So, whilst an agent can act in its own self-interest, the consequence of the individual’s actions upon the total score of the system may also be considered. Note that in table 2.1 the values represent the utility earned by an agent if a particular joint outcome occurs, i.e. 5 is the best outcome for an agent whilst 0 is the worst.

Table 2.1: Prisoner’s Dilemma pay-off matrix.

                      Player i
                      Co-op      Defect
Player j   Co-op      3i, 3j     5i, 0j
           Defect     0i, 5j     1i, 1j

From the individual point of view of player i, the preference ordering of outcomes according to a rational model of agency is: D_iC_j > C_iC_j > D_iD_j > C_iD_j. Therefore, the best outcome for player i is to defect whilst player j cooperates. However, if player j is also rational then j would also defect, causing both players to achieve their third most preferred outcome. As can be seen, cooperation carries with it the risk of achieving the worst possible outcome. So, in order for i both to maximise its individual score and to safeguard against being exploited, it should always defect, no matter what j does.
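The ordering and the "always defect" conclusion can be checked mechanically. The following is a minimal sketch, assuming the pay-offs of table 2.1; the names `PAYOFF`, `preference_order_i` and `best_reply_i` are illustrative and do not come from the source.

```python
# Pay-offs from table 2.1, keyed by (action of i, action of j).
# Each value is (utility of i, utility of j). "C" = cooperate, "D" = defect.
PAYOFF = {
    ("C", "C"): (3, 3),
    ("D", "C"): (5, 0),
    ("C", "D"): (0, 5),
    ("D", "D"): (1, 1),
}


def preference_order_i():
    """Joint outcomes sorted from best to worst by player i's own utility."""
    return sorted(PAYOFF, key=lambda outcome: PAYOFF[outcome][0], reverse=True)


def best_reply_i(action_j):
    """Action of i that maximises i's utility against a fixed action of j."""
    return max("CD", key=lambda action_i: PAYOFF[(action_i, action_j)][0])


# Individual ordering for i: DC, CC, DD, CD.
print(preference_order_i())

# Defecting is i's best reply whichever action j takes.
print({action_j: best_reply_i(action_j) for action_j in "CD"})
```

Because the game is symmetric, the same check with the roles swapped gives the same result for player j.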

If, on the other hand, the total system score of each outcome is considered then, from the perspective of player i, the preference ordering may be rearranged: C_iC_j > D_iC_j > C_iD_j > D_iD_j. In this context, CC is the most preferred outcome as it yields a total of 6, whilst DD is the worst outcome as it yields a system total of only 2; the two mixed outcomes each total 5. The dilemma here is that if both players are rational, the best action to employ is to defect but, as explained, this results in a sub-optimal outcome for both players, in both an individual and a total system context. CC, on the other hand, produces a much better outcome for both the system and the players involved but, if you intend to cooperate, how do you ensure that your opponent will do the same, so that you do not receive the worst possible outcome? This is the true dilemma that the players face.
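The tension between the system view and the individual view can be made concrete with a short sketch, again assuming the pay-offs of table 2.1. The helper names `system_total` and `is_stable` are hypothetical. Here "stable" means that neither player can improve its own utility by switching action unilaterally, which is the standard notion of a Nash equilibrium, a term the text above does not itself use.

```python
# Pay-offs from table 2.1, keyed by (action of i, action of j).
# Each value is (utility of i, utility of j).
PAYOFF = {
    ("C", "C"): (3, 3),
    ("D", "C"): (5, 0),
    ("C", "D"): (0, 5),
    ("D", "D"): (1, 1),
}

OTHER = {"C": "D", "D": "C"}


def system_total(outcome):
    """Sum of both players' utilities for a joint outcome."""
    return sum(PAYOFF[outcome])


def is_stable(outcome):
    """True if neither player gains by unilaterally switching its action."""
    action_i, action_j = outcome
    utility_i, utility_j = PAYOFF[outcome]
    i_gains = PAYOFF[(OTHER[action_i], action_j)][0] > utility_i
    j_gains = PAYOFF[(action_i, OTHER[action_j])][1] > utility_j
    return not (i_gains or j_gains)


# List outcomes from highest to lowest system total, with their stability.
for outcome in sorted(PAYOFF, key=system_total, reverse=True):
    print(outcome, system_total(outcome), is_stable(outcome))
```

Under these pay-offs the outcome with the highest system total (CC, total 6) is not stable, whereas the only stable outcome (DD, total 2) has the lowest total.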

There also exists an iterated version of the Prisoner’s Dilemma in which a player can reward or punish an opponent in a subsequent round for cooperation or defection in a previous round. If, however, the number of iterations is finite and both players are rational, then each will expect the other to defect in the final round n, as there can be no retaliation in subsequent rounds. Therefore, the last “real” round is n-1 and, again, if both players are rational then both would be expected to defect in this round, as both will defect in round n regardless. This argument continues backwards to the initial round, meaning that both agents will mutually defect for the entire game regardless of the presence of subsequent rounds. Such reasoning is termed the “backwards induction paradox” and is outlined in [144]. The existence of this paradox is the primary argument for why DD occurs in the iterated version of the game. As stated, however, the number of rounds must be known to the players for the effect to manifest. Consequently, the iterated version of the game is normally played over a number of rounds not known to the players.
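The backwards induction argument can be sketched in code. This is an illustration under stated assumptions rather than the formulation given in [144]: it uses the pay-offs of table 2.1, assumes both players know the number of rounds, and encodes the inductive step as the assumption that play in later rounds is already fixed and so adds the same continuation value to every current choice. The function names are hypothetical.

```python
# Pay-offs from table 2.1, keyed by (action of i, action of j).
# Each value is (utility of i, utility of j).
PAYOFF = {
    ("C", "C"): (3, 3),
    ("D", "C"): (5, 0),
    ("C", "D"): (0, 5),
    ("D", "D"): (1, 1),
}


def dominant_action(continuation):
    """Action of i that is a best reply to every action of j, or None.

    `continuation` is the utility i expects from all later rounds. By the
    induction hypothesis it does not depend on the current actions, so it
    is added to every option alike.
    """
    for action_i in "CD":
        if all(
            PAYOFF[(action_i, action_j)][0] + continuation
            >= PAYOFF[(alt_i, action_j)][0] + continuation
            for action_j in "CD"
            for alt_i in "CD"
        ):
            return action_i
    return None


def backward_induction(n_rounds):
    """Work from round n back to round 1; return [(round, action), ...]."""
    plan = []
    continuation = 0  # nothing follows the final round
    for round_no in range(n_rounds, 0, -1):
        action = dominant_action(continuation)
        plan.append((round_no, action))
        # The game is symmetric, so j reasons identically and the round's
        # outcome is (action, action).
        continuation += PAYOFF[(action, action)][0]
    return list(reversed(plan))


print(backward_induction(5))
```

Because the continuation value is the same constant for every option, it never changes which action is dominant, and so the sketch selects defection in every round. This mirrors why the argument depends on n being known: without a known final round there is no starting point for the induction.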
