Bayesian tit-for-tat fosters cooperation in evolutionary stochastic games

Arunava Patra, Supratim Sengupta, Sagar Chakraborty

TL;DR

The paper investigates how Bayesian inferential strategies influence cooperation in evolutionary stochastic games where actions alter environmental states. It introduces BTFT, where a Bayesian player infers an opponent's reactive strategy from observed actions via Bayes' rule and then adopts the inferred strategy (posterior maximum) in the next round; this is analyzed against reactive strategies in a two-state resource environment with payoff matrices parameterized by $r_1$ and $r_2$ and a discount factor $\delta$. Through ESS-phase diagrams and imitation-based mutation-selection dynamics, the study shows BTFT is evolutionarily robust against many reactive strategies, and that it generally enhances cooperation and the occupancy of the beneficial state, though the results depend on the transition rule ${\bm \tau}$ linking state changes to actions. The findings highlight the potential for Bayesian learning to support cooperative behavior in dynamic social dilemmas, while outlining conditions under which such strategies remain vulnerable and suggesting avenues for richer cognitive-model extensions.
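
The inference step described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the paper's implementation: a reactive strategy is encoded as a pair $(p, q)$ giving the probability of cooperating after the co-player cooperated or defected, and the four-element hypothesis set, the uniform prior, and the noise level `EPS` are all hypothetical choices made for the example.

```python
from typing import Dict, Tuple

# Assumed encoding: (p, q) = probability that the opponent cooperates
# after the Bayesian player's previous C or D, respectively.
Strategy = Tuple[float, float]

EPS = 0.05  # hypothetical noise so that no hypothesis gets zero likelihood

# Hypothetical hypothesis set: noisy ALLD, TFT, ALLC and anti-TFT.
HYPOTHESES: Dict[str, Strategy] = {
    "ALLD": (EPS, EPS),
    "TFT": (1 - EPS, EPS),
    "ALLC": (1 - EPS, 1 - EPS),
    "ATFT": (EPS, 1 - EPS),
}


def bayes_update(prior: Dict[str, float], my_prev: str, opp_action: str) -> Dict[str, float]:
    """Apply Bayes' rule once: posterior is proportional to prior times likelihood."""
    unnormalised = {}
    for name, weight in prior.items():
        p, q = HYPOTHESES[name]
        coop_prob = p if my_prev == "C" else q
        likelihood = coop_prob if opp_action == "C" else 1.0 - coop_prob
        unnormalised[name] = weight * likelihood
    total = sum(unnormalised.values())
    return {name: w / total for name, w in unnormalised.items()}


def map_hypothesis(belief: Dict[str, float]) -> str:
    """Return the hypothesis with the largest posterior weight."""
    return max(belief, key=belief.get)


# Uniform prior, then two observations: (my previous action, opponent's reply).
belief = {name: 1.0 / len(HYPOTHESES) for name in HYPOTHESES}
for my_prev, opp_action in [("C", "C"), ("D", "D")]:
    belief = bayes_update(belief, my_prev, opp_action)

inferred = map_hypothesis(belief)
```

With these two observations the opponent has mirrored the Bayesian player's previous move both times, so the posterior concentrates on the noisy TFT hypothesis, which is the strategy the Bayesian player would then adopt in the next round.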

Abstract

Learning from experience is a key feature of decision-making in cognitively complex organisms. Strategic interactions involving Bayesian inferential strategies can help us better understand how evolving individual choices to be altruistic or selfish affect collective outcomes in social dilemmas. Bayesian strategies are distinguished from their reactive opponents by their ability to modulate their actions in the light of new evidence. We investigate whether such strategies can be resilient against reactive strategies when actions not only determine the immediate payoff but can also affect future payoffs by changing the state of the environment. We use stochastic games to mimic the change in environment in a manner that is conditioned on the players' actions. By considering three distinct rules governing transitions between a resource-rich and a resource-poor state, we ascertain the conditions under which the Bayesian tit-for-tat strategy can resist being invaded by reactive strategies. We find that the Bayesian strategy is resilient against a large class of reactive strategies and is more effective in fostering cooperation, leading to sustenance of the resource-rich state. However, the extent of success of the Bayesian strategy depends on the other strategies in the pool and on the rule governing transitions between the two resource states.

Paper Structure

This paper contains 19 sections, 27 equations, 7 figures.

Figures (7)

  • Figure 1: A schematic figure showing the interaction between a Bayesian player (shown in blue) and a reactive player (shown in pink) over time. The interaction between the two can occur in game state $s^1$ (represented by a solid green line) or state $s^2$ (represented by a dashed red line). We assume that all interactions initially start in state $s^1$ and the transition vector is $\tau=(1,0,0,0,1,0,0,0)$ such that only mutual cooperation in state $s^1$ ensures that both players remain in state $s^1$ and only mutual cooperation in state $s^2$ leads to a switch from state $s^2$ to state $s^1$. The Bayesian player uses a Bayesian inference engine (depicted in the right half of the figure) to infer the strategy of her opponent at the end of each round and then adopts the inferred strategy as her own strategy in the subsequent round. The inference engine starts with a prior hypothesis $(H)$ about the opponent's strategy and uses the actions of the opponent as evidence $(E)$ to continuously update her hypothesis over time following Bayes' rule.
  • Figure 2: ESS phase diagram for BTFT vs. reactive strategies in an infinite population for the transition vector ${\bm \tau_{00}}$. The color gradient, ranging from 0 to 1, indicates the proportion of the reactive strategy in the mixed ESS. Blue corresponds to the scenario where both BTFT and the reactive strategy are an ESS; green indicates that only BTFT is an ESS; while red indicates that only the reactive strategy is an ESS.
  • Figure 3: ESS phase diagram for BTFT vs. reactive strategies in an infinite population for the transition vector ${\bm \tau_{10}}$. The color scheme followed is the same as in Fig. 2.
  • Figure 4: ESS phase diagram for BTFT vs. reactive strategies in an infinite population for the transition vector ${\bm \tau_{11}}$. The color scheme followed is the same as in Fig. 2.
  • Figure 5: Evolution of cooperation driven by BTFT: The average self-cooperation rate and probability of being in the more beneficial game state are shown for two different mutation-selection processes. The first, second, and third columns, respectively, show the results for the transition vectors ${\bm \tau_{00}}$, ${\bm \tau_{10}}$ and ${\bm \tau_{11}}$. The dashed line and the solid line represent the mutation-selection process with the mutation rate of BTFT as $\frac{1}{9}$ and $\frac{1}{2}$, respectively. The first row (a)-(c) depicts the cooperation level of the population over time, when the population starts from ALLD regardless of the resource's state. The second row (d)-(f) illustrates the frequency of beneficial states over time. The dotted line corresponds to the cooperation level in the absence of BTFT. The third and fourth rows exhibit the frequency of the strategies after $10^4$ generations when the mutation rate of BTFT is $\frac{1}{2}$ (panels g-i) and $\frac{1}{9}$ (panels j-l), respectively. Parameters used: $\beta=10$, $N=100$, $\delta=0.9$, $r_1=10$ and $r_2=2$.
  • ...and 2 more figures
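
The transition vector in the Figure 1 caption, $\tau=(1,0,0,0,1,0,0,0)$, can be read as a lookup table. The sketch below assumes that each entry is the probability of being in state $s^1$ in the next round, and that the eight entries are ordered as the outcomes CC, CD, DC, DD in state $s^1$ followed by the same four outcomes in state $s^2$. That ordering is an assumption chosen to match the caption's description and is not stated explicitly in the text above; the function name is hypothetical.

```python
from typing import Sequence

# Transition vector quoted in the Figure 1 caption.
TAU = (1, 0, 0, 0, 1, 0, 0, 0)

# Assumed ordering of joint actions within each state.
OUTCOMES = ("CC", "CD", "DC", "DD")


def prob_next_state_is_s1(tau: Sequence[float], state: int, joint_action: str) -> float:
    """Look up the probability that the next round is played in state s^1.

    `state` is 1 for s^1 and 2 for s^2; `joint_action` is one of OUTCOMES.
    """
    index = 4 * (state - 1) + OUTCOMES.index(joint_action)
    return tau[index]


stay_in_s1 = prob_next_state_is_s1(TAU, 1, "CC")    # mutual cooperation in s^1
leave_s1 = prob_next_state_is_s1(TAU, 1, "CD")      # any defection in s^1
recover_to_s1 = prob_next_state_is_s1(TAU, 2, "CC") # mutual cooperation in s^2
stuck_in_s2 = prob_next_state_is_s1(TAU, 2, "DD")   # mutual defection in s^2
```

Under this reading, only mutual cooperation keeps or restores state $s^1$, which agrees with the caption's verbal description of the vector.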