A brief review of evolutionary game dynamics in the reinforcement learning paradigm
Guozhong Zheng, Xin Ou, Shengfeng Deng, Jiqiang Zhang, Li Chen
TL;DR
The paper addresses the mismatch between imitation-based evolutionary game models and observed behaviors by advocating reinforcement learning (RL) as a unifying, introspective learning paradigm. It surveys how RL, through mechanisms such as Q-learning and policy-based methods, can explain the emergence of cooperation, trust, fairness, efficient resource allocation, and ecological coexistence, with agents optimizing long-term payoffs rather than copying successful peers. Key contributions include demonstrating RL-driven cooperation in pairwise and multi-agent games, endogenous trust dynamics, fairness arising from historical experience and foresight, and improved resource coordination with phase-transition-like behavior, along with RL-based insights into predator–prey dynamics and biodiversity. The findings suggest that RL provides a cohesive framework for understanding complex social and ecological phenomena, though empirical validation against human and animal behavior remains essential for a full theoretical synthesis.
Abstract
Cooperation, fairness, trust, and resource coordination are cornerstones of modern civilization, yet their emergence remains inadequately explained, as shown by persistent discrepancies between theoretical predictions and behavioral experiments. Part of this gap may arise from the imitation learning paradigm commonly used in prior theoretical models, which assumes that individuals merely copy successful neighbors according to predetermined, fixed rules. This review examines recent advances in evolutionary game dynamics that employ reinforcement learning (RL) as an alternative paradigm. In RL, individuals learn through trial and error and introspectively refine their strategies based on environmental feedback. We begin by introducing key concepts in evolutionary game theory and the two learning paradigms, then synthesize progress in applying RL to elucidate cooperation, trust, fairness, optimal resource coordination, and ecological dynamics. Collectively, these studies indicate that RL offers a promising unified framework for understanding the diverse social and ecological phenomena observed in human and natural systems.
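
To make the contrast between the two learning paradigms concrete, the following is a minimal Python sketch, not code from the review. It assumes the Fermi rule as a representative imitation rule and tabular Q-learning with epsilon-greedy exploration as a representative RL scheme. The payoff values, the parameter defaults, the choice of state (the opponent's previous action), and all function names are illustrative assumptions.

```python
import math
import random

# Hypothetical prisoner's dilemma payoffs for the focal player; not from the review.
PAYOFF = {("C", "C"): 3.0, ("C", "D"): 0.0, ("D", "C"): 5.0, ("D", "D"): 1.0}
ACTIONS = ("C", "D")


def fermi_imitation_prob(my_payoff, neighbor_payoff, noise=0.1):
    """Imitation paradigm: probability of copying a neighbor's strategy (Fermi rule)."""
    return 1.0 / (1.0 + math.exp((my_payoff - neighbor_payoff) / noise))


def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """RL paradigm: one tabular Q-learning update, applied in place."""
    best_next = max(q[next_state].values())
    q[state][action] += alpha * (reward + gamma * best_next - q[state][action])


def choose_action(q, state, epsilon, rng):
    """Epsilon-greedy: explore with probability epsilon, otherwise exploit."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: q[state][a])


def play(rounds=1000, epsilon=0.05, seed=0):
    """Two Q-learners play repeatedly; each agent's state is the opponent's last action."""
    rng = random.Random(seed)
    tables = [{s: {a: 0.0 for a in ACTIONS} for s in ACTIONS} for _ in range(2)]
    states = ["C", "C"]  # assumed initial state for both agents
    for _ in range(rounds):
        acts = [choose_action(tables[i], states[i], epsilon, rng) for i in range(2)]
        for i in range(2):
            reward = PAYOFF[(acts[i], acts[1 - i])]
            next_state = acts[1 - i]
            q_update(tables[i], states[i], acts[i], reward, next_state)
            states[i] = next_state
    return tables
```

In this sketch the imitating agent only compares its payoff with a neighbor's, whereas the Q-learner keeps its own value estimates and weighs future payoffs through the discount factor `gamma`. Calling `play()` returns the two learned Q-tables, which can be inspected to see which action each agent prefers in each state.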
