Coevolution of cognition and cooperation in structured populations under reinforcement learning

Rossana Mastrandrea; Leonardo Boncinelli; Ennio Bilancini

Abstract

We study the evolution of behavior under reinforcement learning in a Prisoner's Dilemma where agents interact on a regular network and can learn whether they are playing a one-shot or a repeated game by incurring a cost of deliberation. Compared with other behavioral rules used in the literature, (i) we confirm the existence of a threshold value of the probability of repeated interaction at which the emergent behavior switches from intuitive defector to dual-process cooperator; (ii) we find a different role of the node degree, with smaller degrees reducing the evolutionary success of dual-process cooperators; (iii) we observe a higher frequency of deliberation.

Paper Structure (3 sections, 7 equations, 2 figures)

Emergence of dual-process cooperation.
Dual-process cooperation and node degree.
Frequency of deliberation.

Figures (2)

Figure 1: Strategy evolution and network structure. (a) Probability of cooperating under intuition, $p_{int}$, as a function of the probability that the game is repeated, $p_G$. (b) Critical value of the probability that the game is repeated, $p_G$, at which $p_{int}=0.5$, for each value of $k \in \{2,4,6,8,20,40\}$. (c) Probability of cooperating under deliberation when the game is repeated, $p_{del}^{rep}$, as a function of $p_G$. (d) Probability of cooperating under deliberation when the game is one-shot, $p_{del}^{1s}$, as a function of $p_G$. (e) Maximum threshold cost of deliberation, $d^*$, as a function of $p_G$. (f) Maximum value of the deliberation cost over the full discrete range of the probability that the game is repeated, $p_G \in \{0,0.1,\dots,1\}$. In all panels the number of neighbours of each node is fixed, $k \in\{2,4,6,8,20,40\}$.
Figure 2: Dual-process cooperation and network structure. Heat maps of (a) the probability of cooperating under intuition, $p_{int}$, and (b) the maximum threshold cost of deliberation, $d^*$, as functions of the number of neighbours, $k$ (i.e., the number of games played by each agent in each round), and the probability that the game is repeated, $p_G$.
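
The critical value in Figure 1(b) is the $p_G$ at which $p_{int}$ reaches 0.5, while $p_G$ is sampled only on the grid $\{0,0.1,\dots,1\}$. One way to estimate such a crossing from gridded values is linear interpolation between the two grid points that bracket it. The sketch below is an illustration under that assumption, not the authors' procedure: the function name `critical_p_g` and the toy curve are hypothetical, and the numbers are not data from the paper.

```python
from typing import Optional, Sequence


def critical_p_g(
    p_g: Sequence[float], p_int: Sequence[float], level: float = 0.5
) -> Optional[float]:
    """Return the first p_G at which p_int reaches `level`.

    Uses linear interpolation between adjacent grid points (an assumption
    made for this sketch). Returns None if the curve never reaches `level`.
    """
    for i in range(len(p_g) - 1):
        x0, x1 = p_g[i], p_g[i + 1]
        y0, y1 = p_int[i], p_int[i + 1]
        if y0 == level:
            # The curve sits exactly on the level at a grid point.
            return x0
        if (y0 - level) * (y1 - level) < 0:
            # The level lies strictly between y0 and y1: interpolate.
            return x0 + (level - y0) * (x1 - x0) / (y1 - y0)
    if len(p_int) > 0 and p_int[-1] == level:
        return p_g[-1]
    return None


# Hypothetical, made-up curve for one value of k (NOT data from the paper).
grid = [i / 10 for i in range(11)]  # p_G in {0, 0.1, ..., 1}
toy_p_int = [0.0, 0.0, 0.1, 0.2, 0.4, 0.8, 1.0, 1.0, 1.0, 1.0, 1.0]

print(critical_p_g(grid, toy_p_int))
```

For the toy curve the crossing falls between $p_G=0.4$ and $p_G=0.5$. Repeating the call for each $k$ would give one critical value per degree, which is the kind of summary panel (b) reports.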
