
In sections 6.5.1 and 6.5.2 above, I investigated the effects of responsiveness and tolerance upon aggregated average total system scores and upon average total system scores when playing against Axelrod strategies that periodically defect, thus providing an answer to research question 4 from section 6.1. Whilst these results are important, the effect of tolerance and responsiveness upon individual scores should also be discussed since success at the macro level does not entail success at the micro level. This section will therefore provide answers to research question 5 from section 6.1.

I begin with a consideration of how responsiveness affects the average individual scores of emotional agents when playing against the Axelrod strategies that periodically defect. Table 6.6 presents the relevant results and offers a number of interesting observations. Firstly, as responsiveness decreases, the average individual score of the emotional characters considered increases. The only exception to this trend is observed when emotional characters play against the joss agent. In this case, the score falls as responsiveness decreases from high to moderate, and decreasing responsiveness further has no effect. The trends related to average individual scores and responsiveness noted for emotional characters A3:G1, A3:G2 and A3:G3 also apply to emotional characters A1:G1-A1:G3 and A2:G1-A2:G3.

Table 6.6: Comparison of the average individual scores (and standard deviations) for initially cooperative emotional agents with characters A3:G1, A3:G2 and A3:G3 when playing against Axelrod strategies that periodically defect.

           Opponent
Emo. Ch.   Random          Tester        Joss
A3:G1      372.4 (20.33)   443 (0.00)    449.4 (74.00)
A3:G2      417 (16.57)     487 (0.00)    239.4 (31.19)
A3:G3      446 (17.78)     513 (0.00)    239.4 (31.19)

Decreased responsiveness causes average individual scores to increase since emotional characters that are less responsive are capable of punishing opponents for longer periods following activation of anger. In this way, periodic defectors who hope to distribute the sucker’s pay-off before re-establishing cooperation cycles (thus maximising both their individual score as well as the total system score) are punished more severely and the punisher recuperates its losses with interest. Furthermore, the quicker an agent is to re-establish cooperation, the more it can be taken advantage of. This form of “naivety” is disadvantageous from an individual standpoint but not from the system’s perspective.

The average individual scores obtained by emotional characters A3:G1, A3:G2 and A3:G3 when playing against the joss agent are caused by a combination of the joss agent’s TFT behaviour and the decreased responsiveness of emotional characters A3:G2 and A3:G3. Since emotional character A3:G1 is quick to cooperate, it quickly re-establishes CC with a joss agent following periodic defection, whereas emotional characters A3:G2 and A3:G3 will establish defection cycles. This increases the average individual score of emotional character A3:G1 because CC earns the individual more per round than DD.

To clarify why the scores for emotional characters A3:G2 and A3:G3 plateau, consider the play histories in table 6.7. In round n the joss agent periodically defects and activates anger in all the emotional characters considered. This then causes the emotional characters to defect in round n + 1 but, because all emotional characters cooperated in round n, the joss agent cooperates in round n + 1 (since it uses TFT). This is where the emotional characters diverge: in round n + 2, emotional character A3:G1 will reciprocate the joss agent’s cooperation in round n + 1 due to its gratitude being activated, whereas emotional character A3:G2 requires two cooperations (emotional character A3:G3 requires three). This establishes DD between the joss agent and emotional characters A3:G2 and A3:G3 since the joss agent reverts to TFT without activating gratitude in either of these agents and periodic cooperation is not possible. However, when playing against emotional character A3:G1, the joss agent will defect in round n + 2 since it reciprocates A3:G1’s defection from round n + 1. Yet, in round n + 3, both A3:G1 and the joss agent will re-establish CC since A3:G1 requires 3 defections to defect, so the joss agent’s defection in round n + 2 has no immediate effect.

Table 6.7: How decreased responsiveness causes a reduction and plateau of average individual score when playing against agents that use TFT and periodically defect.

Round #   A3:G1   Joss   A3:G2   Joss   A3:G3   Joss
n         C       D      C       D      C       D
n + 1     D       C      D       C      D       C
n + 2     C       D      D       D      D       D
n + 3     C       C      D       D      D       D
Score     8       13     7       7      7       7
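The scores in the final row of table 6.7 can be checked by summing pay-offs over the four rounds shown. The following is a minimal sketch, assuming pay-offs of 5 (temptation), 3 (mutual cooperation), 1 (mutual defection) and 0 (sucker's pay-off). These values are an assumption here, chosen because they are consistent with the scores reported in this section, and the names `PAYOFF` and `score_history` are hypothetical.

```python
# Pay-offs as (own, opponent) for each pair of moves (own move, opponent move).
# The values are an assumption that is consistent with the scores in table 6.7.
PAYOFF = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}


def score_history(own_moves, opponent_moves):
    """Return (own total, opponent total) for two equal-length move strings."""
    own_total = 0
    opponent_total = 0
    for own, opponent in zip(own_moves, opponent_moves):
        own_pay, opponent_pay = PAYOFF[(own, opponent)]
        own_total += own_pay
        opponent_total += opponent_pay
    return own_total, opponent_total


# Play histories for rounds n to n + 3, copied from table 6.7
# as (emotional character moves, joss moves).
histories = {
    "A3:G1": ("CDCC", "DCDC"),
    "A3:G2": ("CDDD", "DCDD"),
    "A3:G3": ("CDDD", "DCDD"),
}

for character, (emotional, joss) in histories.items():
    print(character, score_history(emotional, joss))
```

Under these assumed pay-offs the sketch gives 8 and 13 for A3:G1 against the joss agent, and 7 and 7 for both A3:G2 and A3:G3, matching the table.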

As stated, the trends observed for A3:G1, A3:G2 and A3:G3 also hold for A1:G1-A1:G3 and A2:G1-A2:G3 but the scores achieved are much lower. This is caused by the reduced tolerance of these emotional characters which, if the joss agent is primarily considered, causes DD to be locked into more quickly and lower individual scores to be obtained. I now turn to the effect of tolerance upon individual scores and discuss in depth how tolerance affects individual scores and why.

If the average individual scores for initially cooperative A1:G1, A2:G1 and A3:G1 are compared (see table 6.8) it can be seen that, in much the same way as responsiveness, as tolerance increases, emotional characters sacrifice individual score to maximise total system score. Also, as with responsiveness, the trend is reversed when these emotional characters play against the joss agent. However, the difference between tolerance and responsiveness is that increasing tolerance does not cause the individual score to plateau.

Table 6.8: Average individual scores (and standard deviations) of initially cooperative emotional characters A1:G1, A2:G1 and A3:G1 when playing against Axelrod strategies that periodically defect.

           Opponent
Emo. Ch.   Random          Tester        Joss
A1:G1      449 (16.58)     533 (0.00)    228.4 (19.07)
A2:G1      398 (17.84)     465 (0.00)    417.2 (124.63)
A3:G1      372.4 (20.33)   443 (0.00)    449.4 (74.00)

The reversal of the trend when A1:G1, A2:G1 and A3:G1 play against the joss agent (see table 6.8) occurs because of the joss agent’s low likelihood of defection: there is only a 10% chance that the agent will defect in any round. As tolerance increases, it is more likely that cooperation cycles will be maintained since the likelihood of the joss agent defecting twice consecutively is 0.01 (and the likelihood of it defecting three times consecutively is smaller still, at 0.001). So, the high tolerance of A3:G1 enables the highest likelihood of CC maintenance in the face of rare, one-off, periodic defections. In the case of A1:G1, however, periodic defection by the joss agent leads to an unending cycle of TFT play, earning the emotional agent less over 2 rounds than it would if CC were re-established (5 instead of 6). This drawback of the TFT strategy is mentioned in section 2.4.3 of chapter 2.
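The two figures used in this argument can be reproduced with a short calculation. This is a minimal sketch, assuming that the joss agent's random defections are independent from round to round and that the pay-offs are 5, 3, 1 and 0 for temptation, mutual cooperation, mutual defection and the sucker's pay-off respectively; the function name is hypothetical.

```python
def consecutive_defection_probability(p_defect, run_length):
    """Probability of `run_length` independent random defections in a row."""
    return p_defect ** run_length


P_JOSS_DEFECT = 0.1  # 10% chance of defection per round, as stated in the text

for run_length in (1, 2, 3):
    probability = consecutive_defection_probability(P_JOSS_DEFECT, run_length)
    print(run_length, round(probability, 6))

# Earnings of the emotional agent over two rounds (assumed pay-offs).
alternating = 0 + 5  # sucker's pay-off in one round, temptation in the next
mutual_coop = 3 + 3  # CC in both rounds
print(alternating, mutual_coop)
```

The independence assumption ignores any defections that the joss agent makes through its TFT behaviour, so the probabilities apply only to its random defections.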

I now consider scores that pertain to emotional characters A1:G1, A2:G1 and A3:G1 when pitted against the random agent to explain why the average individual score of each emotional character decreases as tolerance to defection increases. If an agent has a high probability of defecting (as the random agent does) then it is not unlikely (as it is for the joss agent) for the agent to defect a number of times consecutively. In such a situation, if an emotional agent is currently cooperating, the more tolerant the emotional agent is, the more often it will receive the sucker’s pay-off. Thus, even though cooperation is maintained, the individual score of the tolerant agent suffers.

So far it has been experimentally verified that, as tolerance and responsiveness increase, the total system score increases whilst an agent’s individual score decreases. But how much of a trade-off between individual score and total system score occurs? To answer this I will consider tolerance. As tolerance increases, the difference between individual scores decreases: emotional characters A1:G1 and A2:G1 have an average individual score difference of 50.8, whereas the difference for emotional characters A2:G1 and A3:G1 is almost halved to 25.8. Using the Prisoner’s Dilemma pay-off matrix (see table 2.1 in section 2.1.2 of chapter 2) the trade-off can be precisely calculated. For every point earned by the system in the context of CC versus DC or CD, an agent (in this case the emotional agent) must sacrifice 2 points with respect to its preferred individual score. This answer raises a further question: how much of a reduction in individual score is acceptable to achieve these system gains?
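The trade-off of 2 individual points per system point can be illustrated by comparing a round of mutual cooperation with a round in which the agent exploits a cooperator. The sketch below assumes pay-offs of 5, 3, 1 and 0 (temptation, mutual cooperation, mutual defection, sucker's pay-off), which are consistent with the scores reported in this section; the names used are hypothetical.

```python
# Pay-offs as (agent, opponent) for each (agent move, opponent move).
# These values are an assumption, not taken from table 2.1 directly.
PAYOFF = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}


def system_score(outcome):
    """Total pay-off earned by both players in a single round."""
    return sum(PAYOFF[outcome])


exploit = ("D", "C")    # the agent defects against a cooperating opponent
cooperate = ("C", "C")  # the agent cooperates instead

# Points gained by the system when the agent cooperates rather than exploits.
system_gain = system_score(cooperate) - system_score(exploit)
# Points given up by the agent relative to its preferred outcome.
individual_cost = PAYOFF[exploit][0] - PAYOFF[cooperate][0]

print(system_gain, individual_cost)
```

With these pay-offs the system gains 1 point per round (6 instead of 5) while the agent gives up 2 points (3 instead of 5).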

To determine when the trade-off between individual and system scores becomes unacceptable, a number of threshold values must be calculated. Various maximal and minimal scores that can be achieved for or tolerated by each entity in the simulations need to be considered to determine these thresholds. Table 6.9 identifies these values along with their method of calculation and maximum/minimum values:

Table 6.9: Threshold values present in the simulation with their method of calculation and maximum/minimum values.

Threshold Value               Calculation                          Max.   Min.
Average Agent 1 Score (A1)    A1 Individual Score                  1000   0
Average Agent 2 Score (A2)    A2 Individual Score                  1000   0
Average System Score          A1 + A2                              1200   400
Average Fairness Score        IF A1 > A2 THEN A2/A1 ELSE A1/A2     1      0

The best possible score that any individual agent can achieve is 1000 whilst the worst is 0; both are achieved when a mendacious strategy plays against a veracious strategy (the mendacious strategy earns 1000, the veracious strategy earns 0). An individual score of 0 is the worst scenario possible for an agent yet the lowest acceptable score, from the point of view of a self-interested agent who always seeks to earn some pay-off, is 200. This is the individual score that each agent earns when two agents lock into DD for an entire game. The best possible total system score is 1200, achieved when two agents initially cooperate and maintain CC for a whole game. The worst total system score is achieved by two agents locking into DD for a whole game, leading to a total system score of 400. Using a combination of the total system score and the individual scores of the agents in the simulation, a measure of fairness can be identified which ranges from 0 to 1. The closer the value is to 1, the more equal the two players’ scores are. It is this measure that I will now focus upon.
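The fairness and system score calculations listed in table 6.9 can be sketched as follows. The function names are hypothetical, and the handling of two equal scores (including two scores of 0, where the ratio is undefined) is an assumption that table 6.9 does not specify. The example scores of 600 and 200 per agent follow from the stated system totals of 1200 and 400.

```python
def fairness(a1, a2):
    """Ratio of the lower individual score to the higher one (0 to 1)."""
    if a1 == a2:
        # Assumption: equal scores, including 0 and 0, count as fully fair.
        return 1.0
    if a1 > a2:
        return a2 / a1
    return a1 / a2


def system_score(a1, a2):
    """Total system score: the sum of both individual scores."""
    return a1 + a2


# Extremes described in the text.
print(fairness(1000, 0), system_score(1000, 0))     # complete exploitation
print(fairness(600, 600), system_score(600, 600))   # CC for a whole game
print(fairness(200, 200), system_score(200, 200))   # DD for a whole game
```

Note that CC and DD for a whole game are equally fair under this measure, which is why fairness has to be read alongside the total system score rather than in isolation.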

The results obtained have so far established that an initially cooperative A3:G1 agent is more successful than an initially cooperative A1:G1 agent (the emotional version of the TFT agent). This success is due to A3:G1’s increased tolerance, i.e. its ability to maintain cooperation, since the responsiveness of emotional characters A1:G1 and A3:G1 is equal. However, the total system scores produced by A3:G1 agents are not fairly distributed when compared to the fairness measures for equally responsive and tolerant characters when playing against Axelrod strategies that periodically defect (see table 6.10). For example, when playing against a random agent the average fairness value obtained by an initially cooperative A3:G1 agent is 0.59, whereas for an initially cooperative A1:G1 agent fairness is equal to 1 and for an initially cooperative A3:G3 agent fairness is equal to 0.98. Despite this, the system total achieved by an initially cooperative A3:G1 agent is much higher than that achieved by its less tolerant and less responsive peers (as previously discussed). It is conceivable that more tolerant agents that are highly responsive will produce greater total system scores at the expense of fairness, but only up to a certain point, i.e. when their individual score falls below the threshold of 200. Below this individual score the trade-off becomes unacceptable for a self-interested agent since consistent defection achieves a greater individual score.

classes are not considered but it is worth mentioning that emotional characters A1:G1, A2:G2 and A3:G3 all achieve the highest fairness ratings in the majority of cases when playing against Axelrod strategies that periodically defect. This is due to these emotional characters all being variations of the TFT strategy: A1:G1 cooperates/defects after receiving 1 cooperation/defection, A2:G2 cooperates/defects after receiving 2 cooperations/defections and A3:G3 cooperates/defects after receiving 3 cooperations/defections. If the fairness scores for these agents are considered when playing against all Axelrod strategies that periodically defect, then it can be observed that A1:G1 always achieves the highest fairness rating out of the three emotional characters whilst A3:G3 always achieves the worst. This is because, as tolerance increases and responsiveness decreases, the emotional agents will punish and reward more slowly, creating a greater rift between the scores of the two agents playing.

Table 6.10: Average fairness ratings for emotional character agents with both types of initial dispositions when playing against random, tester and joss agents.

                        Opponent
Ini. Dis.   Emo. Ch.    Random   Tester   Joss
Coop        A1:G1       1        1        0.98
            A1:G2       0.67     0.5      1
            A1:G3       0.53     0.48     1
            A2:G1       0.69     0.73     0.87
            A2:G2       0.99     0.99     0.97
            A2:G3       0.80     0.81     0.97
            A3:G1       0.59     0.66     0.86
            A3:G2       0.79     0.81     0.96
            A3:G3       0.98     0.98     0.96
Defect      A1:G1       0.99     1        1
            A1:G2       0.64     1        0.98
            A1:G3       0.51     1        0.98
            A2:G1       0.7      1        0.88
            A2:G2       0.99     1        0.98
            A2:G3       0.78     1        0.98
            A3:G1       0.6      1        0.87
            A3:G2       0.81     1        0.98
            A3:G3       0.99     1        0.98

6.6 Chapter Summary

In this chapter I have attempted to model the emotions of anger and gratitude, justify the modelling decisions made, implement the basic emotional characters that will be used in the rest of this thesis and construct and run a number of simulations using agents endowed with these emotional characters in a bespoke environment. This bespoke environment pitted the emotional agents outlined against notable strategies from Axelrod’s famous computer tournament [7] to allow me to provide answers to the questions of whether any emotional characters are more successful than the TFT strategy (noted as the most successful strategy in Axelrod’s tournament) and whether there are any interesting behavioural features associated with the base emotional characters proposed.

With respect to justification for the decision to model anger and gratitude, I believe that the case has been well made. Numerous pieces of literature were presented in section 6.2 that gave precise insights into the eliciting conditions, effects, probabilities of effect and intensity variables associated with anger and gratitude in the context of public goods games. From these pieces of literature I decided to implement these emotions so that anger would be elicited by, and would provoke defection in, an agent whilst gratitude would be elicited by and provoke cooperation. The effects of these emotions are guaranteed, as per the literature’s descriptions. Furthermore, the potentials for these emotions are altered in the same way: when a cooperation or a defection is received, the potential for gratitude/anger is increased by one.
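The mechanism summarised above can be illustrated with a small sketch. This is not the implementation used in the simulations: the class name, the method names and, in particular, the choice to reset a potential to zero once its emotion is activated are assumptions made for illustration only.

```python
class EmotionalAgent:
    """Hypothetical sketch of an agent driven by anger and gratitude."""

    def __init__(self, anger_threshold, gratitude_threshold, initial_move="C"):
        self.anger_threshold = anger_threshold
        self.gratitude_threshold = gratitude_threshold
        self.move = initial_move  # initial disposition: "C" or "D"
        self.anger = 0            # potential for anger
        self.gratitude = 0        # potential for gratitude

    def observe(self, opponent_move):
        """Raise the relevant potential by one and activate it if due."""
        if opponent_move == "D":
            self.anger += 1
            if self.anger >= self.anger_threshold:
                self.move = "D"   # anger provokes defection
                self.anger = 0    # assumption: reset on activation
        else:
            self.gratitude += 1
            if self.gratitude >= self.gratitude_threshold:
                self.move = "C"   # gratitude provokes cooperation
                self.gratitude = 0  # assumption: reset on activation


def play(agent, opponent_moves):
    """Return the agent's moves against a scripted opponent move string."""
    history = []
    for opponent_move in opponent_moves:
        history.append(agent.move)
        agent.observe(opponent_move)
    return "".join(history)


# A1:G1 repeats the opponent's previous move, as TFT does.
print(play(EmotionalAgent(1, 1), "CDDCDC"))  # prints CCDDCD
```

Under these assumptions, an A3:G1 agent whose anger potential already stands at 2 plays C, D, C, C against the joss moves D, C, D, C, which is the history shown for A3:G1 in table 6.7.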

The question of when these emotions are elicited and exert some effect upon the intentional behaviour of the agent is answered in section 6.3. Anger and gratitude have three values that their activation threshold values may be set to: 1, 2 and 3 (as described in section 5.4.1 of chapter 5) and as such, nine emotional characters may be implemented. These emotional characters can be grouped according to their responsiveness and tolerance ratios and these ratios are the main focus of the investigation presented in the chapter.
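The nine emotional characters follow directly from pairing each anger threshold with each gratitude threshold. The sketch below enumerates them using the A:G labels from this chapter and groups them by each threshold. Grouping by raw threshold rather than by ratio is a simplification, and the variable names are hypothetical.

```python
from itertools import product

THRESHOLDS = (1, 2, 3)  # permitted activation thresholds for both emotions

# Every pairing of an anger threshold with a gratitude threshold.
characters = [f"A{a}:G{g}" for a, g in product(THRESHOLDS, THRESHOLDS)]
print(len(characters), characters)

# Tolerance follows the anger threshold: a higher value is slower to defect.
by_tolerance = {a: [f"A{a}:G{g}" for g in THRESHOLDS] for a in THRESHOLDS}

# Responsiveness follows the gratitude threshold: a lower value is
# quicker to cooperate.
by_responsiveness = {g: [f"A{a}:G{g}" for a in THRESHOLDS] for g in THRESHOLDS}

print(by_tolerance[3])        # the most tolerant characters
print(by_responsiveness[1])   # the most responsive characters
```

A3:G1 is the only character that appears in both printed groups, reflecting its description as both the most tolerant and the most responsive character.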

In section 6.4 I gave an overview of the specifics of the simulation environment used to investigate the research questions listed in section 6.1 and outlined the order of play that the simulations follow. Section 6.5 then discussed some of the more interesting results obtained from running these simulations which provided an answer to the questions of whether or not any of the emotional characters implemented are more successful than the TFT agent when playing against other notable strategies from Axelrod’s tournament and why. Furthermore, I have also examined whether there were any interesting behavioural features associated with the base emotional characters proposed. The key results to note are listed below:

• A1:G1 is the emotional analogue of the classical TFT agent due to its low tolerance and high responsiveness.

• The initially cooperative A3:G1 agent offers a significant improvement in total system score over Axelrod’s TFT agent for many reasonable initial configurations.

• The reason for A3:G1’s success is its high responsiveness and tolerance. These qualities are especially important when playing against agents whose behaviour is uncertain, such as the random, tester and joss agents.

• Increased tolerance ensures that cooperation cycles are maintained and system scores are still promoted by players.

• Increased responsiveness ensures that cooperation cycles are established quickly following defection from an opponent.

• A highly tolerant and responsive agent’s individual score suffers since it may be taken advantage of more often and for longer periods of time.

• An agent’s tolerance and responsiveness should never increase to the point where its individual score falls below 200; in this case DD is a better option as the agent will accrue at least some income.

• For highly responsive emotional character agents, increased tolerance decreases the fairness of income distribution since agents are able to be exploited for longer periods.

In the next chapter I turn attention to modelling the emotion of “admiration” and an analysis of its effects upon proliferating emotional characters throughout a larger population of entirely emotional agents.

Admiration

In chapter 6, I experimentally verified that an initially cooperative A3:G1 agent is the most successful agent (with respect to total system score) out of the nine emotional characters and six Axelrod strategies implemented since it is both the most tolerant and the most responsive. Thus, this emotional character is slow to defect and quick to cooperate, qualities that facilitate both the establishment and maintenance of cooperation between itself and its opponent. In chapter 6, it was also shown that, whilst the increased tolerance of A3:G1 enables it to promote total system score most successfully, this benefit comes at a cost to individual score. The disparity between the individual scores of agents is termed the fairness of the system. System fairness (as defined and calculated in this thesis, see section 6.5.3 and table 6.9 of chapter 6) ranges from 0 to 1: the closer to 0, the less fair the individual score distributions, and vice versa. Of its equally responsive counterparts (initially cooperative emotional characters A1:G1 and A2:G1), the initially cooperative A3:G1 agent achieved the lowest fairness scores when playing against agents that periodically defect (see table 6.10 in chapter 6).

So, given a population of emotional agents, would A3:G1 become the most prevalent emotional character if agents could directly adopt the emotional character of others? This question is the focus of this chapter and is primarily inspired by a consideration of Nowak and Roch’s work on spatial reciprocity [136] discussed in section 2.4.1.3 of chapter