The Evolution of Altruistic Rationality Provides a Solution to Social Dilemmas via Rational Reciprocity

Mohammad Salahshour, Iain D. Couzin

TL;DR

This work tackles how cooperation can evolve among rational actors by introducing an evolvable, altruism-driven subjective payoff that distorts objective payoffs. Through an indirect-evolutionary framework, agents learn diverse rational personalities and engage in rational reciprocity, turning dilemmas into coordination problems and promoting widespread cooperation across key two-by-two games. The findings show robust cooperation in both well-mixed and structured populations, with minimal reliance on network reciprocity and without hard-coded cooperative rules. The approach bridges rational decision-making with indirect reciprocity concepts, offering a universal mechanism for cooperation under perfect information and suggesting extensions to more complex settings.

Abstract

Decades of scientific inquiry have sought to understand how evolution fosters cooperation, a concept seemingly at odds with the belief that evolution should produce rational, self-interested individuals. Most previous work has focused on the evolution of cooperation among boundedly rational individuals whose decisions are governed by behavioral rules that do not need to be rational. Here, using an evolutionary model, we study how altruism can evolve in a community of rational agents and promote cooperation. We show that in both well-mixed and structured populations, a population of objectively rational agents is readily invaded by mutant individuals who make rational decisions but evolve a distorted (i.e., subjective) perception of their payoffs. This promotes behavioral diversity and gives rise to the evolution of rational, other-regarding agents who naturally solve all the known strategic problems of two-person, two-strategy games by perceiving their games as pure coordination games.

Paper Structure

This paper contains 12 sections, 6 equations, 5 figures, 1 table.

Figures (5)

  • Figure 1: Subjective rationality in symmetric two-person two-strategy games. a: A symmetric two-person, two-strategy game is defined by two strategies ($C$ and $D$) and four payoff parameters. The first letter in each cell shows the payoff of the row player, and the second letter shows the payoff of the column player. Caring for others transforms the game into one in which the off-diagonal elements of the payoff matrix are a linear superposition of one's own and the opponent's payoff. The coordination force induces cooperation when the individual's altruistic trait value is larger than $\delta_C=\frac{T-R}{T-S}$, and the punishment force induces defection when it is smaller than $\delta_D=\frac{P-S}{T-S}$. Players' personality types can be classified based on their coordination and punishment forces. b: The probability of cooperation in the Nash equilibrium of each game is color-plotted as a function of one's own and the opponent's altruistic trait values, with $\delta_C$ (red) and $\delta_D$ (blue) superimposed. The Prisoner's Dilemma shows two phases, $\delta_D<\delta_C$ and $\delta_C<\delta_D$. In the former (i), conditional defectors, who punish altruists and reward selfishness, coexist with (unconditional) cooperators and defectors. In the latter (ii), conditional cooperators, who reward altruists and punish selfishness, replace conditional defectors. Cooperation evolves only in this phase. In the Snowdrift game (iii), cooperators coexist with conditional defectors, who defect with altruists and play a mixed strategy among themselves. In the Stag Hunt game (iv), cooperators coexist with conditional cooperators, who cooperate with altruists but play a mixed strategy among themselves.
  • Figure 2: The evolution of altruism due to the rational punishment of selfishness. a and b: The population average of individuals' altruistic trait value, $\bar{\delta}$ (a), and the density of cooperators, $\rho_C$ (b), are plotted as functions of time. The evolutionary simulation starts with a population of selfish rationals with zero altruistic trait value, who fail to cooperate in the Prisoner's Dilemma. However, for $R>4$, altruistic rationals evolve and dominate the population, and cooperation reaches a high level. Both the population-average altruistic trait value (altruism) and the density of cooperation fluctuate, suggesting a cyclic dominance of individuals with different levels of altruism. c and d: The strategy played by (c) and against (d) an individual with a given value of the altruistic trait, $\delta$. Individuals with high altruistic trait values cooperate, while those with low values defect. Surprisingly, a higher altruistic trait value also leads to receiving more cooperation and less defection. This curious phenomenon underlies the evolution of cooperation resulting from rational reciprocity. Parameter values: $N=10000$ and $\nu=10^{-3}$. A well-mixed population is used. Individuals play a Prisoner's Dilemma with $T=5$, $P=1$, $S=0$. In a, $R$ is indicated in the legend, and in b, $R=4.1$.
  • Figure 3: The evolution of altruistic rationality in the Prisoner's Dilemma. a and b: Density plots of individuals' altruism as a function of time in a well-mixed (a) and a structured (b) population. Individuals show behavioral diversity and can be decomposed into defectors with a low value of the altruistic trait, conditional cooperators, and cooperators with a high value of the altruistic trait, showing rock-paper-scissors-like dynamics. c and d: Density plots of the time-average value of the altruistic trait in a well-mixed (c) and a structured (d) population as a function of the benefit of cooperation, $R$. Comparing individuals' altruistic trait values with $\delta_C$ (green dashed) and $\delta_D$ (red dashed) allows different personality types to be identified. Below $R=T+S-P=4$, defectors dominate; above $R=4$, cooperators, conditional cooperators, and defectors coexist. Increasing $R$ increases the density of cooperators and decreases the density of defectors and conditional cooperators, especially in a structured population. e and f: The population-average value of the altruistic trait (e) and the proportion of cooperation, $\rho_C$ (f), in a structured and a well-mixed population as a function of the benefit of cooperation, $R$. The purple line shows the result of the replicator dynamics, which is expected to be an exact solution of the model for a well-mixed population in the infinite-population limit. Parameter values: $N=10000$, $\nu=10^{-3}$. In (a), $R=4.1$.
  • Figure 4: The evolution of cooperation by subjective rationality in the Snowdrift and Stag Hunt games. Density plots of the time-average value of individuals' altruistic trait as a function of the benefit of cooperation, $R$, in the Snowdrift game (a), and of the temptation to defect, $T$, in the Stag Hunt game (b). a: In the Snowdrift game, individuals self-organize on the line defined by $\delta=\delta_D$ and are marginally unconditional cooperators. As a result, full cooperation evolves. b: In the Stag Hunt game, individuals evolve above the line defined by $\delta=\delta_C$. Consequently, a monomorphic population of unconditional cooperators evolves who successfully avoid coordination failure. Above $T=3$, the game becomes a Prisoner's Dilemma, and the coexistence of cooperators, conditional cooperators, and defectors is observed. For sufficiently high $T$, defectors dominate. Parameter values: $N=10000$, $\nu=10^{-3}$.
  • Figure 5: Comparison with network reciprocity. a to c: The outcome of evolutionary dynamics for subjectively rational agents in a structured population (blue) and under replicator dynamics (purple; well-mixed population in the limit of large population size) is compared with that for boundedly rational agents in a structured population (structured I, orange, and structured II, red) for the Prisoner's Dilemma (a), Snowdrift (b), and Stag Hunt (c) games. The Nash equilibrium of the game (mixed or pure), expected to occur in a well-mixed population, is plotted as a dashed black line. Structured I indicates an evolutionary algorithm in which individuals reproduce with a probability proportional to the exponential of their payoff times a selection parameter $\beta=0.5$, i.e., $\exp(\beta \pi)$, and structured II indicates an evolutionary algorithm in which individuals reproduce with a probability proportional to their payoff (this algorithm is also used for subjectively rational individuals here). Subjectively rational agents outperform boundedly rational agents and reach higher cooperation in all three games. Cooperation in a structured population remains close to that predicted by the replicator dynamics, indicating that population structure does not play a significant role in promoting cooperation. Parameter values: $N=10000$ and $\nu=10^{-3}$. The population resides on a first-nearest-neighbor network with Von Neumann connectivity and periodic boundaries.
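
The thresholds $\delta_C$ and $\delta_D$ from the Figure 1 caption can be illustrated with a minimal Python sketch. It is not the authors' code. The function names are hypothetical, and the exact weighting of the off-diagonal payoffs, $(1-\delta)\cdot\text{own}+\delta\cdot\text{opponent's}$, is an assumption: the caption only says "linear superposition", and this form was chosen because it reproduces the two stated thresholds when $T>S$.

```python
def thresholds(T, R, P, S):
    """Return (delta_C, delta_D) as defined in the Figure 1 caption."""
    if T == S:
        raise ValueError("thresholds are undefined when T == S")
    return (T - R) / (T - S), (P - S) / (T - S)


def subjective_payoffs(delta, T, R, P, S):
    """Perceived payoffs keyed by (own move, opponent's move).

    ASSUMPTION: off-diagonal entries are (1 - delta) * own + delta * other.
    Diagonal entries are unchanged because both players earn the same there.
    """
    return {
        ("C", "C"): R,
        ("C", "D"): (1 - delta) * S + delta * T,
        ("D", "C"): (1 - delta) * T + delta * S,
        ("D", "D"): P,
    }


def classify(delta, T, R, P, S):
    """Label a personality type by its strict best responses (assumes T > S)."""
    u = subjective_payoffs(delta, T, R, P, S)
    coop_vs_c = u[("C", "C")] > u[("D", "C")]  # equivalent to delta > delta_C
    coop_vs_d = u[("C", "D")] > u[("D", "D")]  # equivalent to delta > delta_D
    if coop_vs_c and coop_vs_d:
        return "cooperator"
    if coop_vs_c:
        return "conditional cooperator"  # C against C, D against D
    if coop_vs_d:
        return "conditional defector"    # D against C, C against D
    return "defector"


if __name__ == "__main__":
    # Prisoner's Dilemma parameters from the Figure 2 caption, with R = 4.1.
    print(thresholds(5, 4.1, 1, 0))
    for delta in (0.1, 0.19, 0.5):
        print(delta, classify(delta, 5, 4.1, 1, 0))
```

With $T=5$, $P=1$, $S=0$, the two thresholds cross at $R=T+S-P=4$. Conditional cooperators therefore exist only for $R>4$, which is consistent with the phase boundary reported in the Figure 3 caption. The sketch treats an agent who is exactly indifferent as not cooperating, which is an arbitrary tie-breaking choice.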