Adaptive coordination promotes collective cooperation in repeated social dilemmas

Feipeng Zhang; Te Wu; Long Wang

Abstract

Direct reciprocity based on the repeated prisoner's dilemma has been intensively studied. Most theoretical investigations have concentrated on memory-$1$ strategies, a class of elementary strategies that react only to the outcome of the previous round. Although the properties of "All-or-None" strategies ($AoN_K$) have been discovered, simulations have confirmed the good performance of $AoN_K$ only for very short memory lengths. It remains unclear how $AoN_K$ strategies fare when players have access to a longer history of past rounds. We construct a theoretical model to investigate the performance of the class of $AoN_K$ strategies of varying memory length $K$. We rigorously derive the payoffs and show that $AoN_K$ strategies of intermediate memory length $K$ are the most prevalent, while strategies of larger memory length are less competitive. Larger memory lengths make it hard for $AoN_K$ strategies to coordinate, thus inhibiting their mutual reciprocity. We then propose the adaptive coordination strategy, which combines tolerance with the coordination rule of $AoN_K$. This strategy behaves like $AoN_K$ when coordination is insufficient, and tolerates opponents' occasional deviations by continuing to cooperate when coordination is sufficient. We find that the adaptive coordination strategy outcompetes classic memory-$1$ strategies in various typical competition environments and stabilizes the population at high levels of cooperation, suggesting that high-level adaptability is effective in resolving social dilemmas. Our work may offer a theoretical framework for exploring complex strategies that use history information and differ from traditional memory-$n$ strategies.

Paper Structure (8 sections, 5 equations, 6 figures)

Model description.
$AoN_K$ strategies and $ADCO$ strategies with arbitrary cooperation threshold $K$.
Payoffs against selected strategies.
Evolutionary dynamics of two-strategy competition.
Evolutionary dynamics of multi-strategy competition.
Payoff calculation.
Replicator dynamics for pairwise competitions.
Full population dynamics for competitions between $ADCO$ and memory-$1$ strategies.

Figures (6)

Figure 1: Illustration of $AoN_K$ strategies and $ADCO$ strategies. (A) We say the past $I$ $(I=1,2,3,\ldots)$ rounds are coordinated if all players have chosen the same action in each of the previous $I$ consecutive rounds. $AoN_K$ cooperates if the past $K$ rounds are coordinated; otherwise, it defects. We call $K$ the cooperation threshold. Clearly, $K$ is also the minimum memory length required to implement the $AoN_K$ strategy. Depending on the length of the current run of coordinated rounds, the outcomes of the past $K$ rounds can be classified into $K+1$ different states $A_i$ $(i = 0, 1, 2, \ldots, K)$. $A_i$ indicates that the past $i$ rounds are coordinated, and $A_0$ means that the most recent round is uncoordinated. Transitions between these $K+1$ states work as follows. When all players have chosen the same action in the current round, state $A_i$ with $i < K$ moves to $A_{i+1}$, and state $A_K$ stays where it is. When players have adopted different actions in the current round, state $A_i$ $(i=0,1,\ldots,K)$ moves to $A_0$. Following one round of uncoordinated actions, $AoN_K$ immediately retaliates by defecting for the next $K$ rounds. (B) For large $K$, a single accidental trembling hand can trap $AoN_K$ strategies in long-term defection, leading to the collapse of cooperation. To explore the nature of direct reciprocity with large memory lengths, we propose the adaptive coordination ($ADCO$) strategy, which incorporates forgiveness into the action-implementation process. An $ADCO$ player with cooperation threshold $K$ and tolerance $t$ behaves the same way as an $AoN_K$ player as long as fewer than $K$ consecutive coordinated rounds have occurred. Once $K$ coordinated rounds have occurred, the $ADCO$ player cooperates and counts how many consecutive rounds state $A_K$ has been maintained. Once the interaction has stayed in state $A_K$ for $t$ rounds, the $ADCO$ player becomes forgiving: it still cooperates after one of the opponents defects, but it restarts the count. In all other states, once uncoordinated actions occur, the $ADCO$ player defects in the next round and the state moves to $A_0$. (C) State-to-state transitions and decision-making for $AoN_K$ and $ADCO$ strategies. State $A_{K,j}$ $(j\in\{1,2,\cdots,t\})$ indicates that state $A_K$ has lasted for $j$ rounds. Following a coordinated round, state $A_K$ transitions to $A_{K,1}$, states $A_{K,j}$ with $j < t$ transition to $A_{K,j+1}$, and state $A_{K,t}$ remains unchanged. After one round of uncoordinated actions, state $A_{K,t}$ transitions to $A_K$, whereas all other states transition to $A_0$. Observing for $t$ additional rounds, which we call the observation phase, improves $ADCO$'s ability to distinguish between opponents, especially between defectors and co-species. In interactions between $ADCO$ players and defectors, the state has little chance of reaching the observation phase, so an $ADCO$ player behaves like an $AoN_K$ player and thus effectively avoids exploitation. In interactions among $ADCO$ players, they are more likely to cooperate many times in a row, which triggers their forgiveness and enhances their mutual reciprocity. The parameter $t$ represents the tolerance level of the $ADCO$ strategy: the smaller $t$ is, the more forgiving the strategy. As $t$ tends to infinity, $ADCO$ with cooperation threshold $K$ becomes equivalent to $AoN_K$.
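The state transitions and decision rule described in the Figure 1 caption can be sketched as a small state machine. This is a minimal illustration rather than the authors' code: the integer state encoding (`s` in `0..K` for $A_s$, `s = K + j` for $A_{K,j}$) and the names `cooperates`, `next_state`, and `run` are assumptions introduced here.

```python
from typing import List, Optional


def cooperates(s: int, K: int) -> bool:
    """AoN_K and ADCO both cooperate only once K coordinated rounds are on record."""
    return s >= K


def next_state(s: int, coordinated: bool, K: int, t: Optional[int] = None) -> int:
    """Advance the shared history state by one round.

    Encoding (an assumption of this sketch): s in 0..K stands for A_s,
    and s = K + j stands for A_{K,j}. Passing t=None gives plain AoN_K.
    """
    top = K if t is None else K + t  # highest reachable state
    if coordinated:
        return min(s + 1, top)
    # Uncoordinated round: only ADCO in A_{K,t} forgives and falls back to A_K.
    if t is not None and s == K + t:
        return K
    return 0


def run(outcomes: List[bool], K: int, t: Optional[int] = None, s: int = 0) -> List[int]:
    """Replay coordinated/uncoordinated flags and return every visited state."""
    states = [s]
    for coordinated in outcomes:
        s = next_state(s, coordinated, K, t)
        states.append(s)
    return states


if __name__ == "__main__":
    # Three coordinated rounds followed by two uncoordinated ones.
    history = [True, True, True, False, False]
    print("AoN_2          :", run(history, K=2))
    print("ADCO(K=2, t=1) :", run(history, K=2, t=1))
```

In this replay the $ADCO$ player is still in a cooperating state after the first uncoordinated round, whereas $AoN_2$ has already dropped to $A_0$; a second consecutive uncoordinated round sends both to $A_0$.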
Figure 2: Cooperation rate of $AoN_K$ and $ADCO$ strategies in interactions between co-species. (A) For a fixed group size, the cooperation rate of $AoN_K$ declines drastically as the cooperation threshold $K$ increases. However, $ADCO$ groups sustain a high level of cooperation for a wide range of $K$. (B) The cooperation dilemma becomes more challenging as the group size increases. Even so, $ADCO$ players still maintain a cooperation level of $>75\%$ for group sizes as large as $200$. However, the cooperation rate drops dramatically as the group size expands. (C) Cooperation rate for different levels of tolerance, where $K=100$. $ADCO$ has limitations in maintaining cooperation when the group size or memory length surpasses a certain threshold. A high cooperation rate is difficult to maintain for $ADCO$ with a low level of forgiveness. Parameters: (A) group size $N=5$ and $t=1$; (B) $K=5$ and $t=1$; (C) $K=100$ and $N=5$.
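A rough way to reproduce the qualitative trend in the Figure 2 caption is a Monte Carlo run of a homogeneous group under implementation errors. The paper derives payoffs analytically, so the following is only a sketch under stated assumptions: each player flips its intended action independently with probability `eps`, play starts in state $A_0$, and the function name `cooperation_rate` and all default values are chosen here for illustration.

```python
import random
from typing import Optional


def cooperation_rate(N: int, K: int, t: Optional[int] = None,
                     eps: float = 0.01, rounds: int = 100_000,
                     seed: int = 0) -> float:
    """Estimate the fraction of cooperative actions in a homogeneous group.

    t=None simulates AoN_K; an integer t simulates ADCO with tolerance t.
    All players observe the same history, so they share one state s:
    s in 0..K stands for A_s and s = K + j stands for A_{K,j}.
    """
    rng = random.Random(seed)
    top = K if t is None else K + t
    s = 0  # assumed initial state: no coordinated rounds on record
    cooperative_actions = 0
    for _ in range(rounds):
        intended = s >= K  # True means "cooperate"
        # Each player implements the opposite action with probability eps.
        actions = [intended != (rng.random() < eps) for _ in range(N)]
        cooperative_actions += sum(actions)
        coordinated = all(actions) or not any(actions)
        if coordinated:
            s = min(s + 1, top)
        elif t is not None and s == K + t:
            s = K  # ADCO forgives one deviation and restarts its count
        else:
            s = 0
    return cooperative_actions / (N * rounds)


if __name__ == "__main__":
    for K in (1, 5, 20):
        aon = cooperation_rate(N=5, K=K, rounds=20_000)
        adco = cooperation_rate(N=5, K=K, t=1, rounds=20_000)
        print(f"K={K:2d}  AoN_K: {aon:.3f}  ADCO(t=1): {adco:.3f}")
```

Because all players see the same actions, a single shared state is enough here; a mixed population would need the same update applied to the common history while each strategy applies its own decision rule.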
Figure 3: Interior fixed points $x_\ast$ of the replicator equation describing the population dynamics of $ADCO$ (or $AoN_K$) competing with $ALLD$. The population is well-mixed and of infinite size. Let $x$ denote the fraction of $ADCO$ players with $t=2$ (or of $AoN_K$ players), so that $1-x$ is the fraction of $ALLD$ players. For $K\ge2$, the replicator equation admits three fixed points: the two trivial points $x=0$ and $x=1$, and an interior fixed point $x_\ast$. Arrows indicate the direction of selection. The population dynamics exhibit positive frequency dependence. When the initial frequency of $ADCO$ players is above $x_\ast$, the population converges to $x=1$; otherwise, $ALLD$ players take over the whole population. Similar properties are observed when $AoN_K$ competes with $ALLD$. Increasing the cooperation threshold $K$ improves the ability of $ADCO$ (or $AoN_K$) to resist invasion by defectors, because its basin of attraction expands; the improvement is more pronounced for $ADCO$.
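For reference, the two-strategy replicator equation behind the Figure 3 caption has the standard form below. The symbols $\pi_{A}(x)$ and $\pi_{D}(x)$, denoting the expected payoffs of $ADCO$ (or $AoN_K$) and $ALLD$ when the cooperating strategy has frequency $x$, are notation introduced here and may differ from the paper's own.

$$
\begin{aligned}
\dot{x} &= x\,(1-x)\,\bigl[\pi_{A}(x) - \pi_{D}(x)\bigr], \\
\pi_{A}(x_\ast) &= \pi_{D}(x_\ast), \qquad 0 < x_\ast < 1 .
\end{aligned}
$$

The factor $x(1-x)$ gives the trivial fixed points $x=0$ and $x=1$. Under positive frequency dependence the payoff difference is negative for $x < x_\ast$ and positive for $x > x_\ast$, so $x_\ast$ is unstable and separates the two basins of attraction.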
Figure 4: Evolutionary dynamics of pairwise competition. A wide variety of classic strategies are selected to test the evolutionary performance of $ADCO$ $(t=1)$ by pitting it against each of them in turn. Our findings reveal that $ADCO$ exhibits a remarkable ability to invade homogeneous resident populations. It can easily overpower both the cooperative $ALLC$ strategy and the selfish $ALLD$ strategy. Furthermore, $ADCO$ also takes over the whole population in competitions with each of the tolerant strategies $TFT$, $GTFT_{0.2}$, and $GTFT_{0.5}$. It is worth noting that the success of $ADCO$ varies with the initial fractions of the resident strategies. For instance, while $ADCO(K=3)$ can invade a resident population of Cumulative Reciprocity ($\Delta=2$) players, $ADCO(K=30)$ cannot. Please refer to the Supplementary Information for details of the strategies involved.
Figure 5: Evolutionary dynamics for competitions within the class of $AoN_K$ strategies. We consider a sufficiently large class of $50$ strategies $(K=1,2,\cdots,50)$. The abundance of strategies peaks at an intermediate cooperation threshold and drops when moving away from this optimum in either direction. $AoN_K$ strategies with a low cooperation threshold exhibit robust cooperation amidst noise, yet are susceptible to invasion by strategies with high cooperation thresholds. In contrast, $AoN_K$ strategies with a high cooperation threshold can exploit those with low thresholds, but interactions among strategies with larger $K$ fail to achieve high levels of cooperation, thereby compromising the stability of homogeneous $AoN_K$ populations. $AoN_K$ strategies are therefore most effective at an intermediate cooperation threshold. Specifically, the strategy $AoN_{16}$ is the most abundant in the long run. Parameters: population size $M=100$, implementation error $\varepsilon=0.01$, and selection intensity $\beta=1$.
...and 1 more figures
