The State-Action-Reward-State-Action Algorithm in Spatial Prisoner's Dilemma Game

Lanyu Yang, Dongchun Jiang, Fuqiang Guo, Mingjian Fu

TL;DR

This study employs the State-Action-Reward-State-Action (SARSA) algorithm as the decision-making mechanism for individuals in evolutionary game theory and applies SARSA to imitation learning, where agents select neighbors to imitate based on rewards.

Abstract

Cooperative behavior is prevalent in both human society and nature. Understanding the emergence and maintenance of cooperation among self-interested individuals remains a significant challenge in evolutionary biology and social sciences. Reinforcement learning (RL) provides a suitable framework for studying evolutionary game theory as it can adapt to environmental changes and maximize expected benefits. In this study, we employ the State-Action-Reward-State-Action (SARSA) algorithm as the decision-making mechanism for individuals in evolutionary game theory. Initially, we apply SARSA to imitation learning, where agents select neighbors to imitate based on rewards. This approach allows us to observe behavioral changes in agents without independent decision-making abilities. Subsequently, SARSA is utilized for primary agents to independently choose cooperation or betrayal with their neighbors. We evaluate the impact of SARSA on cooperation rates by analyzing variations in rewards and the distribution of cooperators and defectors within the network.
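To make the decision mechanism concrete, the SARSA update described above can be sketched for a single two-action (cooperate/defect) agent. This is a minimal illustrative sketch, not the paper's implementation: the classic prisoner's dilemma payoffs (T=5, R=3, P=1, S=0), the always-defecting neighbor, and all hyperparameters are assumptions made here for demonstration; the paper parameterizes payoffs via $D_r$ and $D_g$ instead.

```python
import random

random.seed(0)

ACTIONS = ["C", "D"]
# Illustrative payoffs PAYOFF[(my_action, neighbor_action)]; the paper
# instead parameterizes the game with D_r and D_g.
PAYOFF = {("C", "C"): 3.0, ("C", "D"): 0.0, ("D", "C"): 5.0, ("D", "D"): 1.0}

def choose(q, state, eps=0.1):
    """Epsilon-greedy action selection over the Q-table."""
    if random.random() < eps:
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: q[(state, a)])

def sarsa_update(q, s, a, r, s2, a2, alpha=0.1, gamma=0.9):
    """On-policy SARSA update: Q(s,a) += alpha * (r + gamma*Q(s',a') - Q(s,a))."""
    q[(s, a)] += alpha * (r + gamma * q[(s2, a2)] - q[(s, a)])

# One agent repeatedly playing against an always-defecting neighbor;
# the state is the neighbor's most recent action.
q = {(s, a): 0.0 for s in ACTIONS for a in ACTIONS}
state = "C"
action = choose(q, state)
for _ in range(2000):
    reward = PAYOFF[(action, "D")]
    next_state = "D"                      # the neighbor always defects
    next_action = choose(q, next_state)
    sarsa_update(q, state, action, reward, next_state, next_action)
    state, action = next_state, next_action

# Against a constant defector, defection accumulates the higher Q-value.
```

In the paper's spatial setting each lattice site runs such an update against several neighbors at once; this sketch only shows the core on-policy learning rule.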

Paper Structure

This paper contains 4 sections, 4 equations, 7 figures, 2 algorithms.

Figures (7)

  • Figure 1: Spatial distribution of cooperators (blue) and defectors (white) at different time steps for the Prisoner's Dilemma parameters $D_r=0$, $D_g=0.02$. The first row shows the evolution of traditional individuals; the second row shows the evolution of agents trained with the SARSA algorithm.
  • Figure 2: Heat map of the cooperation rate under different $D_r$ and $D_g$. The left panel shows traditional individuals; the right panel shows agents using the SARSA algorithm.
  • Figure 3: Average reward of traditional individuals and SARSA agents.
  • Figure 4: Average reward of cooperators and betrayers. The left panel shows traditional individuals; the right panel shows agents using the SARSA algorithm.
  • Figure 5: Game evolution spot graph. White: a betrayer that is not a SARSA agent. Red: a cooperator that is not a SARSA agent. Blue: a cooperator that is a SARSA agent. Green: a betrayer that is a SARSA agent.
  • ...and 2 more figures