Coevolution of cognition and cooperation in structured populations under reinforcement learning

Rossana Mastrandrea, Leonardo Boncinelli, Ennio Bilancini

Abstract

We study the evolution of behavior under reinforcement learning in a Prisoner's Dilemma where agents interact on a regular network and can learn whether the game they play is one-shot or repeated by incurring a cost of deliberation. Compared with other behavioral rules used in the literature, (i) we confirm the existence of a threshold value of the probability of repeated interaction at which the emergent behavior switches from intuitive defector to dual-process cooperator; (ii) we find a different role of the node degree, with smaller degrees reducing the evolutionary success of dual-process cooperators; (iii) we observe a higher frequency of deliberation.

Paper Structure (3 sections, 7 equations, 2 figures)

This paper contains 3 sections, 7 equations, 2 figures.

Figures (2)

  • Figure 1: Strategy evolution and network structure. (a) Probability of cooperating under intuition, $p_{int}$, as a function of the probability that the game is repeated, $p_G$. (b) Critical value of the probability that the game is repeated, $p_G$, at which $p_{int}=0.5$, for each value of $k \in \{2,4,6,8,20,40\}$. (c) Probability of cooperating under deliberation when the game is repeated, $p_{del}^{rep}$, as a function of $p_G$. (d) Probability of cooperating under deliberation when the game is one-shot, $p_{del}^{1s}$, as a function of $p_G$. (e) Maximum threshold cost of deliberation, $d^*$, as a function of $p_G$. (f) Maximum value of the deliberation cost over the whole discrete range of the probability that the game is repeated, $p_G \in \{0,0.1,\dots,1\}$. In all panels, each node has a fixed number of neighbours, $k \in\{2,4,6,8,20,40\}$.
  • Figure 2: Dual-process cooperation and network structure. Heat-maps of (a) the probability of cooperating under intuition, $p_{int}$, and (b) the maximum threshold cost of deliberation, $d^*$, as functions of the number of neighbours (i.e., the number of games played by each agent in each round), $k$, and the probability that the game is repeated, $p_G$.
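
The captions above describe a regular network in which every agent has the same number of neighbours, $k$, and plays one game per neighbour in each round. As an illustration only, the sketch below builds one such structure, a ring lattice. The source does not say which regular topology is used, so the ring layout, the function name `ring_lattice`, and the population size of 100 are assumptions made for this example.

```python
def ring_lattice(n, k):
    """Return {node: sorted neighbour list} for a k-regular ring lattice.

    Assumption: a ring layout is only one possible regular network; the
    source states regularity (equal degree k), not the exact topology.
    """
    if k <= 0 or k % 2 != 0:
        raise ValueError("k must be a positive even integer")
    if k >= n:
        raise ValueError("k must be smaller than the number of nodes n")
    half = k // 2
    neighbours = {}
    for i in range(n):
        linked = set()
        # Connect node i to the `half` nearest nodes on each side of the ring.
        for step in range(1, half + 1):
            linked.add((i + step) % n)
            linked.add((i - step) % n)
        neighbours[i] = sorted(linked)
    return neighbours


if __name__ == "__main__":
    # Degrees taken from the figure captions; n = 100 is a hypothetical size.
    for k in (2, 4, 6, 8, 20, 40):
        graph = ring_lattice(100, k)
        games_per_round = {len(v) for v in graph.values()}
        print(k, games_per_round)
```

In this sketch the number of games an agent plays per round is simply the length of its neighbour list, which equals $k$ for every node by construction.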