Conditional cooperation and emotional profiles

June 27, 2013 by Keven Poulin

I haven’t been delving into evolutionary game theory and agent-based modeling for very long, and yet I find that in that little time something quite eerie happens once I’m immersed in these models and simulations: I find myself oscillating between two diametrically opposed points of view. As I watch all of these little agents play their games using some all-too-simplistic strategy, I feel like a small God*. I watch cooperators cooperate and defectors defect, oblivious to what’s in their best interest at the moment. Of course, in the end, my heart goes out to the cooperators, who unfortunately can’t understand that they are being exploited by the defectors. That is what pushes me to the other end of the spectrum of omniscience, and with a nudge of empathy I find myself trying to be a simpleton agent in my over-simplified world.

In that state of mind, I begin to wonder what information exists in the environment, in particular information about the agents I am going to play against. I suppose I’m able to access it and use it to condition my move. Admittedly, that makes me a bit more complex than my original simpleton, and that complexity is likely to come at a cost, but I leave it to evolution to figure out whether the trade-off is worthwhile.



An early example of this kind of conditional strategy comes from the hypothesis that a trait could have evolved that allowed individuals to recognize their kin and thus those with whom cooperation is preferable. This idea dates all the way back to Hamilton (1964) but was popularized by Dawkins (1976) as the memorable “Green Beard Effect”. In the EGT literature this concept was captured by the introduction of tags: agents are given arbitrary tags that vary discretely (e.g. Jansen & van Baalen, 2006) or on a continuum (e.g. Riolo, Cohen & Axelrod, 2001). How to use this information is straightforward: cooperate with agents of the same tag, and defect otherwise (although traitors could also be allowed). This mechanism assumes not only perception of others’ attributes but also knowledge of one’s own attribute. In real life, a tag could be anything from cell surface adhesion proteins (Queller, Ponte, Bozzaro & Strassmann, 2003) to shirt colour. Dawkins (as cited in van Baalen & Jansen, 2003) was actually skeptical that such a mechanism would be effective in promoting altruism because of its vulnerability to cheating, but Jansen & van Baalen (2006) show that a fluctuating pattern of coexisting beard colours (or “beard chromodynamics” if you’re into fancy words) could maintain altruism even in weakly structured populations. Artem’s inviscid model in Bifurcation and Cooperation is thought to exhibit this kind of fluctuation (though additional work is needed to confirm this).
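The tag rule is simple enough to write down directly. Here is a minimal Python sketch of both flavours. The function names are mine, and the `tolerance` parameter for continuous tags is an assumption made for illustration (a natural way to say "same tag" on a continuum), not a detail taken from the text above.

```python
def cooperates_discrete(own_tag: int, partner_tag: int) -> bool:
    """Cooperate only with agents carrying exactly the same discrete tag."""
    return own_tag == partner_tag


def cooperates_continuous(own_tag: float, partner_tag: float,
                          tolerance: float) -> bool:
    """Cooperate when the partner's tag is close enough to one's own.

    `tolerance` is an assumed parameter: how far apart two tags may be
    and still count as "the same".
    """
    return abs(own_tag - partner_tag) <= tolerance
```

Both functions need the agent's own tag as an input, which is exactly the "knowledge of one's own attribute" assumption mentioned above.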

Szolnoki & Perc (2012) apply conditional strategies to a spatial public goods game (which is dynamically equivalent to PD with self-interactions). Agents play in groups and can either contribute a fixed amount of resource (cooperate) or not contribute anything (defect). Contributions are then summed up, multiplied and redistributed equally among players regardless of whether they contributed or not. In Szolnoki & Perc’s model, agents decide to cooperate according to how many cooperators there are in the group, and their strategy is defined by the number of cooperators they require to decide to cooperate. If an agent’s number is 0 then it is an unconditional cooperator; if it is higher than the size of the group, then it is an unconditional defector. All agents in between are conditional cooperators. The authors find that in this scheme, the harshest conditional cooperators (those who cooperate only if they are in a group without any defectors) can isolate defectors and drive them to extinction (in a viscous environment), and thus allow cooperation to settle.
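To make the threshold idea concrete, here is a small Python sketch of one round in a single group. It is an illustration under my own assumptions, not the authors' implementation: the function names, the default contribution of 1, the multiplier value, and the idea that each agent already knows how many cooperators it is facing (for example from the previous round) are all hypothetical.

```python
from typing import List


def wants_to_cooperate(threshold: int, cooperators_seen: int) -> bool:
    """True if the agent sees at least `threshold` cooperators in its group."""
    return cooperators_seen >= threshold


def public_goods_payoffs(contributes: List[bool], multiplier: float,
                         contribution: float = 1.0) -> List[float]:
    """Pool the contributions, multiply, and split equally among all players."""
    pot = multiplier * contribution * sum(contributes)
    share = pot / len(contributes)
    # Contributors pay their contribution; free riders keep the full share.
    return [share - contribution if c else share for c in contributes]


# Hypothetical group of four: threshold 0 (unconditional cooperator),
# 2 and 3 (conditional cooperators), 5 (unconditional defector, since 5 > 4).
thresholds = [0, 2, 3, 5]
cooperators_seen = 2  # assumed to be known, e.g. from the previous round
moves = [wants_to_cooperate(t, cooperators_seen) for t in thresholds]
payoffs = public_goods_payoffs(moves, multiplier=3.0)
```

In this toy group the first two agents contribute and the other two do not, and the non-contributors walk away with the larger payoff, which is the exploitation the harsh conditional cooperators are meant to shut down.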

If you’re unsure about how to play, then you may use a fixed strategy, but try to choose whom you play against. This idea was originally implemented using a partner preference based on tags. More recently, Brede (2011) let his agents choose partners based on their success as determined by game interaction payoffs. It seems reasonable to assume here that the author refers to payoffs from the previous round of play, but an explicit description of this (and also of what happens on the very first round) would make replication more straightforward. At each cycle, agents play as many times as they have neighbours, and the choice at each time is determined probabilistically based on how much more (or less) successful the candidates are. Hence, while some neighbours are not chosen at all, others will be played against more than once. Note that this decision is unilateral: once an agent is chosen, it has to play, and its payoff will also include outcomes of games it did not choose. Making no prior assumption on the nature and strength of the bias, he found that choosing stronger partners was the strongest, most stable strategy in the long run.
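Here is a rough Python sketch of this kind of biased partner choice. The text above does not give the functional form of the bias, so the exponential weighting on the payoff difference is purely my assumption, as are the function names and the `bias` parameter. A positive `bias` favours more successful neighbours, a negative one favours less successful neighbours, and zero gives a uniform choice.

```python
import math
import random
from typing import List


def choice_probabilities(own_payoff: float, neighbour_payoffs: List[float],
                         bias: float) -> List[float]:
    """Probability of picking each neighbour (assumed exponential weighting)."""
    weights = [math.exp(bias * (p - own_payoff)) for p in neighbour_payoffs]
    total = sum(weights)
    return [w / total for w in weights]


def choose_partners(own_payoff: float, neighbour_payoffs: List[float],
                    bias: float, rng: random.Random) -> List[int]:
    """Draw as many partners as there are neighbours, with replacement.

    Returns neighbour indices; some may be missing, others may repeat.
    """
    probs = choice_probabilities(own_payoff, neighbour_payoffs, bias)
    k = len(neighbour_payoffs)
    return rng.choices(range(k), weights=probs, k=k)


# Hypothetical agent with payoff 1.0 and three neighbours.
rng = random.Random(0)
partners = choose_partners(1.0, [0.0, 1.0, 2.0], bias=2.0, rng=rng)
```

Because the draws are made with replacement, the sketch reproduces the feature described above: a neighbour can be skipped entirely or played against several times in the same cycle.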

If you can’t choose who you play against, then remembering your partner’s previous moves is the next best thing. That’s what agents did in the well-documented iterated Prisoner’s Dilemma. In Vukov, Santos & Pacheco (2012), however, agents can only remember the last move, and their strategy is defined by a profile (p,q), where p is the probability to cooperate given past cooperation (“mutualism”), and q, the probability to cooperate given past defection (“forgiveness”). (1-p) and (1-q) are interpreted as “treason” and “retaliation”, respectively. When allowing these “reactive” strategies to compete against defectors and unconditional cooperators in heterogeneous networks, they find reactive strategies thrive and maintain high levels of cooperation, while unconditional cooperators die out. The authors also manipulate the payoff matrix and use the sucker’s payoff U and the temptation V as measures of “fear” and “greed”, respectively. A quick look at the UV-space shows that their manipulations (U in [-1,0] and V in [1,2]) keep the game within the bounds of the PD.
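A reactive (p,q) strategy fits in a few lines of Python. This is only a sketch: the function name and the choice to pass in a random number generator are mine, and I take "past cooperation" to mean the partner's last move.

```python
import random


def reactive_move(p: float, q: float, partner_cooperated_last: bool,
                  rng: random.Random) -> bool:
    """Return True (cooperate) or False (defect) for a (p, q) strategy.

    p: probability of cooperating after the partner cooperated.
    q: probability of cooperating after the partner defected.
    """
    probability = p if partner_cooperated_last else q
    return rng.random() < probability
```

The corners of the (p,q) square recover familiar strategies: (1,1) is an unconditional cooperator, (0,0) an unconditional defector, and (1,0) simply copies the partner's last move, as tit-for-tat does.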

Using such words as forgiveness, greed, and fear already gives our simpleton agents some personality; the temptation to anthropomorphize simulated agents is strong. In this trend, Szolnoki, Xie, Wang, & Perc (2011) and Szolnoki, Xie, Ye, & Perc (2013) went all the way and gave their agents “emotional profiles”! True, this choice of words serves some purpose: it makes talking about the model a bit easier, more accessible to intuition. But this comes at a very high risk of letting that intuition run wild and draw conclusions that go (way) beyond the data. A greater risk still is to cloud real, relevant results with oversimplifications of concepts we barely understand. This in turn sends a message about the level of rigor used in this still-emerging field. Therefore, extra caution and self-monitoring are in order when introducing human emotions and the like as metaphors.

In any case, the idea in Szolnoki et al. (2011, 2013) is interesting: an agent’s profile is defined by (a,b), where a is the probability of cooperating when playing with a less successful agent, and b, the probability of cooperating when playing with a more successful partner (thereby assuming agents have access to each other’s payoff). Hence, a high a is akin to “sympathy”, and a low b, akin to “envy”. In their paper, the authors claim that agents do not pass on strategies but emotional profiles, but this is, again, playing with words: an emotional profile is a strategy in the larger sense. Their results show that the homogeneity or heterogeneity of the interaction network will favour different profiles and allow them to take over the population. This hints at the importance of network attributes which are another type of information that exists in these artificial worlds and that could be used by agents…
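Written as code, an (a,b) profile looks very much like any other conditional strategy. The Python sketch below is an illustration only: the function names are mine, and since the description above does not say what happens when both payoffs are equal, treating a tie like the "less successful partner" case is my own assumption.

```python
import random


def cooperation_probability(a: float, b: float, own_payoff: float,
                            partner_payoff: float) -> float:
    """Probability of cooperating under an (a, b) emotional profile."""
    if partner_payoff > own_payoff:
        return b  # partner is more successful: a low b reads as "envy"
    # Partner is less successful: a high a reads as "sympathy".
    # Ties also land here, which is an assumption of this sketch.
    return a


def emotional_move(a: float, b: float, own_payoff: float,
                   partner_payoff: float, rng: random.Random) -> bool:
    """Sample an actual move: True means cooperate, False means defect."""
    return rng.random() < cooperation_probability(a, b, own_payoff,
                                                  partner_payoff)
```

Seen this way, the profile is just a pair of numbers that gets copied from one agent to another, which is why calling it a strategy seems fair.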

… and so Chiang (2013) did just that. He allows agents to condition their strategy on agents’ nodal attributes in the interaction graph. The attributes explored are degree, clustering coefficient, betweenness, and eigenvector centrality. If you are reading this, you did not fall into a Wikipedia vortex; you are a model of self-discipline. Congratulations, but back to business… When playing the game, an agent will look at its partner’s attribute, compute the difference with its own, and then apply its strategy, which is defined by an interval within which this difference must fall for there to be cooperation. The interval is flexible and can favour cooperation with agents of lower, higher, similar, or even all attributes. The interval can also impede cooperation altogether (the special case where the upper bound is lower than the lower bound), thus making the agent an unconditional defector. What he finds is that regardless of attributes, two strategies are most successful: cooperate only with higher attribute and cooperate only with lower attribute. Hence, strategy intervals containing 0 (cooperation with identical attribute) did not fare well. Obviously, network architecture is very important, and the effects of different topologies are reported.
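The interval rule can be sketched in Python using degree as the attribute. Several details here are my assumptions rather than Chiang's specification: the difference is taken as partner minus own, the interval is closed at both ends, and the toy star graph and the named example intervals are invented for illustration.

```python
import math


def cooperates(own_attr: float, partner_attr: float,
               lower: float, upper: float) -> bool:
    """Cooperate if (partner - own) falls inside the interval [lower, upper].

    If upper < lower the interval is empty and the agent never cooperates.
    """
    difference = partner_attr - own_attr
    return lower <= difference <= upper


# Toy star graph: one hub connected to three leaves.
neighbours = {"hub": ["x", "y", "z"], "x": ["hub"], "y": ["hub"], "z": ["hub"]}
degree = {node: len(adjacent) for node, adjacent in neighbours.items()}

# Example intervals for an integer-valued attribute such as degree.
ONLY_HIGHER = (1, math.inf)    # cooperate only with better-connected partners
ONLY_LOWER = (-math.inf, -1)   # cooperate only with less-connected partners
NEVER = (1, 0)                 # empty interval: unconditional defector

leaf_helps_hub = cooperates(degree["x"], degree["hub"], *ONLY_HIGHER)
hub_helps_leaf = cooperates(degree["hub"], degree["x"], *ONLY_HIGHER)
```

With ONLY_HIGHER, the leaf cooperates with the hub but not the other way around, and two leaves never cooperate with each other because their difference is 0.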

Notice the recurrent theme in the last three models of allowing agents to condition while making no prior assumption as to which way this condition will sway, even if, intuitively, a specific strategy seems optimal. This is good practice because although a strategy may do well on its own against defectors, it is possible that other less optimal strategies impede it from gaining the momentum it needs to take over. Such a dynamic is observed in Szolnoki & Perc (2012).

Another consideration I’ve alluded to and which every author cited here makes a point of addressing is the plausibility for an agent to have access to the information it is supposed to condition on. This is all the more important if you are interested in the possibility of deception. Indeed, anything about another player an agent can condition on can be interpreted as a signal originating from the other player and reaching the agent. If there is the possibility for the signal to be misrepresented to elicit cooperation while intending to defect, deception will occur. If deception becomes the norm then the cue agents were conditioning on becomes worthless and they’d sooner look for another one. This problem is at the heart of animal signal theory, a field rich in insights about the evolution of communication and even language. Giving agents the ability to manipulate signals (or allowing that signal to mutate) may be seen as an added layer of complexity, but it may also just be fair: after all, we just gave agents perfect perception of partners’ attributes.

As an agent, I wish I had it as easy as possible, but as a mini-God peering over the simulation, I want to be surprised and see cooperation emerge when it’s not so obvious… Hmmm…

*Check out Fassbinder’s World on a Wire for an early treatment of this theme.

References

Brede, M. (2011). Playing against the fittest: A simple strategy that promotes the emergence of cooperation. EPL. 94, 30003.

Chiang, Y.-S. (2013). Cooperation could evolve in complex networks when activated conditionally on network characteristics. Journal of Artificial Societies and Social Simulation. 16 (2).

Dawkins, R. (1976). The selfish gene. Oxford University Press.

Hamilton, W. D. (1964). The genetical evolution of social behaviour. I and II. J. Theor. Biol. 7, 1–16 and 17–52.

Jansen, V. A. A. & van Baalen, M. (2006). Altruism through beard chromodynamics. Nature. 440, 663-666.

Queller, D. C., Ponte, E., Bozzaro, S. & Strassmann, J. E. (2003). Single-gene greenbeard effects in the social amoeba Dictyostelium discoideum. Science. 299, 105-106.

Riolo, R. L., Cohen, M. D. & Axelrod, R. (2001). Evolution of cooperation without reciprocity. Nature. 414, 441-443.

Szolnoki, A. & Perc, M. (2012). Conditional strategies and the evolution of cooperation in spatial public goods games. Physical Review E. 85, 026104.

Szolnoki, A., Xie, N.-G., Wang, C. & Perc, M. (2011). Imitating emotions instead of strategies in spatial games elevates social welfare. EPL. 96, 38002.

Szolnoki, A., Xie, N.-G., Ye, Y. & Perc, M. (2013). Evolution of emotions on networks leads to the evolution of cooperation in social dilemmas. Physical Review E. 87, 042805.

Van Baalen, M. & Jansen, V. A. A. (2003). Common language or Tower of Babel? On the evolutionary dynamics of signals and their meanings. Proc. R. Soc. Lond. B. 270, 69-76.

Vukov, J., Santos, F. C. & Pacheco, J. M. (2012). Cognitive strategies take advantage of the cooperative potential of heterogeneous networks. New Journal of Physics. 14, 063031.