Game-Theoretic Multiagent Reinforcement Learning
Yaodong Yang, Chengdong Ma, Zihan Ding, Stephen McAleer, Chi Jin, Jun Wang, Tuomas Sandholm
TL;DR
The paper surveys the game-theoretic foundations of multiagent reinforcement learning (MARL) and its modern advances, connecting classic equilibrium concepts with contemporary algorithmic frameworks. It covers single-agent RL as a prelude, then extends to stochastic and extensive-form games, partially observable settings, and mean-field regimes that address scalability. Key contributions include a comprehensive taxonomy of MARL algorithms, a rigorous treatment of equilibrium concepts (Nash equilibrium, NE; correlated equilibrium, CE; coarse correlated equilibrium, CCE), and detailed discussions of challenges such as non-stationarity, combinatorial complexity, and learning in large populations. The work highlights future directions across theory, safety, model-based approaches, meta-learning, and the integration of foundation models to push MARL toward robust, scalable real-world deployments.
Abstract
Tremendous advances have been made in multiagent reinforcement learning (MARL). MARL corresponds to the learning problem in a multiagent system in which multiple agents learn simultaneously. It is an interdisciplinary field of study with a long history that includes game theory, machine learning, stochastic control, psychology, and optimization. Despite great successes in MARL, there is a lack of a self-contained overview of the literature that covers game-theoretic foundations of modern MARL methods and summarizes the recent advances. The majority of existing surveys are outdated and do not fully cover the recent developments since 2010. In this work, we provide a monograph on MARL that covers both the fundamentals and the latest developments on the research frontier. The goal of this monograph is to provide a self-contained assessment of the current state-of-the-art MARL techniques from a game-theoretic perspective. We expect this work to serve as a stepping stone for both new researchers who are about to enter this fast-growing field and experts in the field who want to obtain a panoramic view and identify new directions based on recent advances.
