A memory-based spatial evolutionary game with the dynamic interaction between learners and profiteers

Bin Pi, Minyu Feng, Liang-Jian Deng

TL;DR

This work examines how memory and dynamic role-switching between profit-seeking profiteers and self-learning learners shape cooperation in spatial evolutionary games. It introduces a memory-based Snowdrift game on networks in which learners update their strategies via $Q$-learning and profiteers update via the Fermi rule. Individuals switch roles according to a two-state Markov process, and payoffs incorporate memory through the memory length $M$ and the memory decay factor $\beta$. A key contribution is the derivation of the stationary distribution, $\pi_1=q/(p+q)$ and $\pi_2=p/(p+q)$, which predicts the long-run number of individuals in each role. Simulations show that dynamic interactions and memory jointly promote cooperation, and that higher learning rates $\alpha$ and smaller discount factors $\gamma$ enhance it further; the results are robust to network size and topology. The findings offer mechanistic insight into how memory and a mix of learning and profit-seeking behaviors sustain cooperation in structured populations, with implications for designing AI-driven social systems and multi-agent environments.
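
The stationary distribution quoted above can be recovered in a few lines. The following is a sketch under stated assumptions, not the paper's own derivation: state 1 is taken to be "learner" and state 2 "profiteer", with $p$ the learner-to-profiteer probability and $q$ the profiteer-to-learner probability (as in the Figure 1 caption), and $N$ is a symbol introduced here for the network size.

$$
P = \begin{pmatrix} 1-p & p \\ q & 1-q \end{pmatrix}, \qquad \pi P = \pi, \qquad \pi_1 + \pi_2 = 1 .
$$

The first component of $\pi P = \pi$ gives $\pi_1 (1-p) + \pi_2 q = \pi_1$, i.e. $\pi_1 p = \pi_2 q$. Combined with the normalization $\pi_1 + \pi_2 = 1$ (and $p + q > 0$):

$$
\pi_1 = \frac{q}{p+q}, \qquad \pi_2 = \frac{p}{p+q} .
$$

Under this labeling, the expected long-run number of learners in a population of $N$ individuals is $N\pi_1 = Nq/(p+q)$. For example, $p = q = 0.5$ gives an even split, while $p = 1$, $q = 0$ gives $\pi_1 = 0$, a population of pure profiteers.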

Abstract

Spatial evolutionary games provide a valuable framework for elucidating the emergence and maintenance of cooperative behavior. However, most previous studies assume that all individuals are profiteers and neglect the effects of memory. To bridge this gap, we propose a memory-based spatial evolutionary game with dynamic interaction between learners and profiteers. Specifically, the network contains two categories of individuals, profiteers and learners, which follow different strategy updating rules. The interaction between the two categories is dynamic: each individual switches between the profiteer and learner categories with given transition probabilities, which is modeled as a Markov process. In addition, the payoff of each individual is determined not only by a single round of the game but also by the individual's memory mechanism. Extensive numerical simulations validate the theoretical analysis and reveal that dynamic interactions between profiteers and learners foster cooperation, that memory mechanisms facilitate the emergence of cooperative behavior among profiteers, and that increasing the learning rate of learners raises the number of cooperators. The robustness of the model is verified through simulations across various network sizes. Overall, this work contributes to a deeper understanding of the mechanisms driving the formation and evolution of cooperation.
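
To make the two strategy updating rules concrete, here is a minimal Python sketch of their textbook forms: the Fermi imitation probability for profiteers and a tabular $Q$-learning step for learners. It is an illustration under assumptions, not the authors' implementation. The function names, the noise value `noise=0.1`, and the encoding of states and actions as indices `0` and `1` are all hypothetical choices made here.

```python
import math


def fermi_adopt_probability(payoff_self, payoff_neighbor, noise=0.1):
    """Probability that a profiteer copies a neighbor's strategy (Fermi rule).

    `noise` is an assumed value for the selection-noise parameter;
    the source text does not state the value used in the paper.
    """
    exponent = (payoff_self - payoff_neighbor) / noise
    # Clamp the exponent so math.exp cannot overflow for extreme payoff gaps.
    exponent = max(-700.0, min(700.0, exponent))
    return 1.0 / (1.0 + math.exp(exponent))


def q_update(q_table, state, action, reward, next_state, alpha, gamma):
    """One tabular Q-learning step with learning rate alpha and discount gamma.

    q_table[s][a] holds the current value estimate; states and actions are
    plain integer indices (an assumed encoding, e.g. 0 and 1 for the two
    strategies).
    """
    best_next = max(q_table[next_state])
    old_value = q_table[state][action]
    q_table[state][action] = (1 - alpha) * old_value + alpha * (reward + gamma * best_next)
    return q_table[state][action]


if __name__ == "__main__":
    # Equal payoffs: the neighbor's strategy is adopted with probability 0.5.
    print(fermi_adopt_probability(1.0, 1.0))
    q = [[0.0, 0.0], [0.0, 1.0]]
    print(q_update(q, state=0, action=1, reward=1.0, next_state=1, alpha=0.5, gamma=0.9))
```

In this form, a larger `alpha` makes a learner weight the newest reward more heavily, and a smaller `gamma` shrinks the contribution of the estimated future value. These are the two parameters whose effects on cooperation the paper studies.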

Paper Structure (12 sections, 8 equations, 9 figures, 4 tables)


Figures (9)

  • Figure 1: An illustration of the model. This figure provides an example of the proposed model. Profiteers and learners interact dynamically in the network, and each category of individual uses its own strategy updating rule. In addition, each individual changes category from learner to profiteer with probability $p$, and from profiteer to learner with probability $q$.
  • Figure 2: Evolutionary curves and statistical distributions of the number of learners under different transition probabilities. The green squares represent the results for $p = 0.5, q = 0.5$, while the blue diamonds and black triangles represent the results for $p = 0.5, q = 0.8$ and $p = 0.8, q = 0.5$, respectively, each differing from the green case in a single parameter. The red line indicates the theoretical result derived from Eq. \ref{Expectation of scale}. In (a), the evolutionary curves, the $x$-axis is evolutionary time and the $y$-axis is the number of learners. In (b), the statistical distributions, the $x$-axis is the number of learners and the $y$-axis is the corresponding probability. The number of learners stabilizes around $t = 100$, and the numerical simulation results agree with the theory.
  • Figure 3: Heat maps of the cooperation ratio as a function of the transition probabilities $p$ and $q$. This figure shows the impact of the transition probabilities $p$ ($y$-axis) and $q$ ($x$-axis) between profiteers and learners on cooperative behavior on the (a) SL and (b) WS networks. On both networks, the cost-to-benefit ratio of the Snowdrift game (SDG), the memory decay factor, and the memory length are fixed to $r = 0.6$, $\beta = 0.5$, and $M = 5$, respectively. The heat maps visualize how varying $p$ and $q$ affects cooperative dynamics in different network topologies. The inclusion of learners boosts the frequency of cooperators compared to the case of pure profiteers.
  • Figure 4: Heat maps of cooperation frequency as a function of memory length and memory decay factor under the coexistence of profiteers and learners. This figure illustrates the influence of the memory decay factor $\beta$ and the memory length $M$ on the frequency of cooperators on the SL (panel (a)) and WS (panel (b)) networks when profiteers and learners coexist. The transition probabilities between learners and profiteers are set to $p = q = 0.5$, ensuring a balanced presence of learners and profiteers in the network. The payoff parameter of the SDG is set to $r = 0.3$. On both networks, increasing the memory length or the memory decay factor inhibits cooperative behavior.
  • Figure 5: Heat maps of cooperation frequency as a function of memory length and memory decay factor in the case of pure profiteers. This figure presents the influence of the memory decay factor $\beta$ ($y$-axis) and the memory length $M$ ($x$-axis) on cooperative behavior on the SL (panel (a)) and WS (panel (b)) networks when all individuals are profiteers. Here, the transition probabilities between learners and profiteers are fixed to $p = 1$ and $q = 0$, so that every individual in the network is a profiteer and no learners exist. The payoff parameter of the SDG is set to $r = 0.3$. The heat maps show that memory mechanisms favor the survival of cooperators in groups of pure profiteers.
  • ...and 4 more figures
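
The comparison described in the Figure 2 caption, simulated learner counts against the theoretical value, can be reproduced in spirit with a short Python sketch. It is an illustration under assumptions, not the authors' code. Each individual switches role independently at every step, the function names and the random initial roles are hypothetical, and the theoretical count is taken to be $Nq/(p+q)$, following from the stationary distribution stated in the TL;DR with $p$ as the learner-to-profiteer probability.

```python
import random


def simulate_learner_counts(n, p, q, steps, seed=0):
    """Simulate role switching for n individuals; return the learner count per step.

    p: probability that a learner becomes a profiteer in one step.
    q: probability that a profiteer becomes a learner in one step.
    Initial roles are drawn uniformly at random (an assumption).
    """
    rng = random.Random(seed)
    is_learner = [rng.random() < 0.5 for _ in range(n)]
    counts = []
    for _ in range(steps):
        for i in range(n):
            u = rng.random()
            if is_learner[i]:
                if u < p:
                    is_learner[i] = False  # learner -> profiteer
            elif u < q:
                is_learner[i] = True  # profiteer -> learner
        counts.append(sum(is_learner))
    return counts


def expected_learners(n, p, q):
    """Long-run expected number of learners, n * q / (p + q)."""
    return n * q / (p + q)


if __name__ == "__main__":
    n = 2000
    # The three (p, q) pairs match those listed in the Figure 2 caption.
    for p, q in [(0.5, 0.5), (0.5, 0.8), (0.8, 0.5)]:
        counts = simulate_learner_counts(n, p, q, steps=300, seed=1)
        tail_mean = sum(counts[100:]) / len(counts[100:])
        print(f"p={p}, q={q}: simulated {tail_mean:.1f}, theory {expected_learners(n, p, q):.1f}")
```

Because roles switch independently of the game itself in this sketch, no network structure or payoff computation is needed to check the learner count; the population size `n = 2000` is an arbitrary choice for the demonstration.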