Conditional cooperation with longer memory

Nikoleta E. Glynatsi; Martin A. Nowak; Christian Hilbe


Abstract

Direct reciprocity is a widespread mechanism for the evolution of cooperation. In repeated interactions, players can condition their behavior on previous outcomes. A well-known approach is given by reactive strategies, which respond to the co-player's previous move. Here we extend reactive strategies to longer memories. A reactive-$n$ strategy takes into account the sequence of the last $n$ moves of the co-player. A reactive-$n$ counting strategy records how often the co-player has cooperated during the last $n$ rounds. We derive an algorithm to identify all partner strategies among reactive-$n$ strategies. Partner strategies are those that ensure mutual cooperation without exploitation. We give explicit conditions for all partner strategies among reactive-2 and reactive-3 strategies, and among reactive-$n$ counting strategies. We perform evolutionary simulations and find that longer memory increases the average cooperation rate for reactive-$n$ strategies but not for reactive-$n$ counting strategies. Paying attention to the sequence of moves is necessary for reaping the advantages of longer memory.
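
The distinction between reactive-$n$ strategies and reactive-$n$ counting strategies can be illustrated with a minimal Python sketch. The representation is an assumption made for illustration, not the authors' code: a reactive-$n$ strategy is a dictionary that maps each possible history of the co-player's last $n$ moves to a cooperation probability, and a counting strategy is a list `r` whose entry `r[k]` is the cooperation probability after the co-player cooperated `k` times in those $n$ rounds. The function names are hypothetical.

```python
from itertools import product


def expand_counting_strategy(r):
    """Expand a counting strategy into a full reactive-n strategy.

    Assumption: r[k] is the cooperation probability after the co-player
    cooperated k times in the last n rounds, so len(r) == n + 1.
    Returns a dict mapping each history (tuple of 'C'/'D', oldest move
    first) to a cooperation probability.
    """
    n = len(r) - 1
    return {h: r[h.count("C")] for h in product("CD", repeat=n)}


def is_counting_strategy(p):
    """Return True if p depends only on how often the co-player cooperated."""
    by_count = {}
    for history, prob in p.items():
        k = history.count("C")
        # The first probability seen for a given count k is the reference;
        # any later history with the same count must agree with it.
        if by_count.setdefault(k, prob) != prob:
            return False
    return True
```

A reactive-$n$ strategy has $2^n$ entries, whereas a counting strategy has only $n+1$ free parameters. Expanding `[0.1, 0.5, 1.0]` yields a reactive-2 strategy in which the entries for the histories $(C, D)$ and $(D, C)$ coincide.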


Paper Structure (5 sections, 4 equations, 4 figures)

This paper contains 5 sections, 4 equations, 4 figures.

Introduction
Results
Discussion
Data, Materials, and Software Availability

Figures (4)

Figure 1: The repeated prisoner's dilemma among players with finite memory. A, In the repeated prisoner's dilemma, in each round two players independently decide whether to cooperate ($C$) or to defect ($D$). B, When players adopt memory-1 strategies, their decisions depend on the entire outcome of the previous round. That is, they consider both their own and the co-player's previous action. C, When players adopt a reactive-$n$ strategy, they make their decisions based on the co-player's actions during the past $n$ rounds. D, A self-reactive-$n$ strategy is contingent on the player's own actions during the past $n$ rounds. E, To illustrate these concepts, we show a game between a player with a reactive-$1$ strategy (top) and an arbitrary player (bottom). Reactive-1 strategies can be represented as a vector $\mathbf{p} \!=\! (p_C, p_D)$. The entry $p_C$ is the probability of cooperating given the co-player cooperated in the previous round. The entry $p_D$ is the cooperation probability after the co-player defected. F, Now, the top player adopts a self-reactive-1 strategy, $\mathbf{\tilde{p}}\!=\!(\tilde{p}_C, \tilde{p}_D)$. Here, the top player's cooperation probabilities depend on their own previous action.
Figure 2: Characterizing the partners among the reactive-$n$ strategies. A,B, To characterize the reactive-$n$ partner strategies, we prove the following result. Suppose the focal player adopts a reactive-$n$ strategy. Then, for any strategy of the opponent (with arbitrary memory), one can find an associated self-reactive-$n$ strategy that yields the same payoffs. Here, we show an example where player 1 uses a reactive-1 strategy against player 2 with a memory-1 strategy. Our result implies that player 2 can switch to a well-defined self-reactive-1 strategy. This switch leaves the outcome distribution unchanged. In both cases, mutual cooperation, unilateral cooperation, and mutual defection occur with the same long-run frequencies. C, Based on this insight, we can explicitly characterize the reactive-2 partner strategies (with $p_{CC}\!=\!1$). Here, we represent the corresponding conditions \ref{eq:two_bit_conditions} for a donation game with $b/c\!=\!2$. Among the reactive-2 strategies, the counting strategies correspond to the subset with $p_{CD}\!=\!p_{DC}$. Counting strategies only depend on how often the co-player cooperated in the past, not on the timing of cooperation. D, Similarly, we can also characterize the reactive-2 partner strategies for the general prisoner's dilemma. Here, we use the values of Axelrod \cite{axelrod:AAAS:1981}.
Figure 3: Conditions for partners among reactive-$2$ and reactive-$3$ strategies. A, Some pure self-reactive strategies can be described as playing a fixed sequence of actions. Since their actions do not depend on the co-player's history, such a strategy repeats its sequence indefinitely. For example, in the case of $n=2$, the pure self-reactive strategy $\mathbf{\tilde{p}}=(0, 1)$, which alternates, can be described as playing the sequence $(D, C)$. B, We have proven that to characterize reactive-$n$ partner strategies, one only needs to check deviations towards self-reactive-$n$ strategies. Thus, for a nice reactive strategy $\mathbf{p}$ to be a partner, it is necessary that none of these sequence-playing self-reactive strategies achieves a higher payoff against $\mathbf{p}$ than $\mathbf{p}$ does against itself. Panel B shows the resulting partner conditions for $n=2$ and $n=3$, one for each of the sequences in panel A. These conditions are not only necessary but also sufficient (see Supporting Information). C, To derive the conditions, we need to consider the payoff that a sequence player achieves against a reactive strategy. In the top part of panel C, we illustrate an example for $n=2$, with the sequence $(D, C)$ played against $\mathbf{p} = (1, p_{CD}, p_{DC}, p_{DD})$. In the third turn, the sequence player receives a benefit $b$ with probability $p_{DC}$ and pays no cost, since the sequence player does not cooperate. In the fourth turn, the player receives $p_{CD} \cdot b - c$, and thereafter these two payoffs repeat forever. Thus, for $n=2$, the payoff of the sequence player is determined by what they receive every two turns, which is $(p_{CD} + p_{DC}) \cdot b - c$. This payoff needs to be less than or equal to what a partner strategy achieves against another nice strategy over two turns, which is $2(b\!-\!c)$. In the bottom part of panel C, we illustrate an example for $n=3$.
Figure 4: Evolutionary dynamics of reactive-$n$ strategies. To explore the evolutionary dynamics among reactive-$n$ strategies, we run simulations based on the method of Imhof and Nowak imhof:royal:2010. This method assumes rare mutations. Every time a mutant strategy appears, it goes extinct or fixes before the arrival of the next mutant strategy. A,B, We run ten independent simulations for reactive-$n$ strategies and for reactive-$n$ counting strategies. For each simulation, we record the most abundant strategy (the strategy that resisted most mutants). The respective average cooperation probabilities are in line with the conditions for partner strategies. C,D, With additional simulations, we explore the average abundance of partner strategies and the population's average cooperation rate. For a given resident strategy to be classified as a partner by our simulation, it needs to satisfy all inequalities in the respective definition of partner strategies. In addition, it needs to cooperate after full cooperation with a probability of at least 95%. For all considered parameter values, we only observe high cooperation rates when partner strategies evolve. Simulations are based on a donation game with $b\!=\!1$, $c\!=\!0.5$, a selection strength $\beta\!=\!1$ and a population size $N\!=\!100$, unless noted otherwise. For $n$ equal to 1 and 2, simulations are run for $T\!=\! 10 ^ 7$ time steps. For $n\!=\!3$ we use $T\!=\! 2 \!\cdot\!10 ^ 7$ time steps.
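
The payoff argument in the caption of Figure 3 can be sketched in Python. This is an illustrative sketch under stated assumptions, not the authors' implementation. It assumes a donation game with benefit `b` and cost `c`, a reactive-$n$ strategy stored as a dictionary from the co-player's last $n$ moves (oldest first) to a cooperation probability, and a co-player who repeats a fixed sequence forever. The function names and the list of sequences passed to the check are hypothetical. Which sequences suffice for a given $n$ is stated in the paper, not here.

```python
def sequence_payoff(seq, p, b, c):
    """Long-run average payoff per round of a player repeating `seq`
    against the reactive-n strategy `p` in a donation game.

    p maps tuples of the co-player's last n moves ('C'/'D', oldest first)
    to a cooperation probability.
    """
    n = len(next(iter(p)))  # memory length, read off from any key of p
    m = len(seq)
    total = 0.0
    for t in range(m):
        # The reactive player sees the sequence player's previous n moves;
        # in the periodic regime these wrap around the sequence.
        history = tuple(seq[(t - n + j) % m] for j in range(n))
        # Expected benefit received, minus the cost if the sequence
        # player cooperates in this round.
        total += b * p[history] - (c if seq[t] == "C" else 0.0)
    return total / m


def resists_sequences(p, sequences, b, c, tol=1e-12):
    """Return True if no listed sequence earns more per round than b - c,
    the mutual-cooperation payoff of a nice strategy against itself."""
    return all(sequence_payoff(s, p, b, c) <= (b - c) + tol for s in sequences)
```

For the alternating sequence $(D, C)$ and $n=2$, `sequence_payoff` returns $\big((p_{CD} + p_{DC})\, b - c\big)/2$, which is the two-turn payoff from the caption divided by two. Comparing it with $b - c$ per round is therefore equivalent to comparing the two-turn payoff with $2(b - c)$.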
