Learning in Mean Field Games: A Survey

Mathieu Laurière; Sarah Perrin; Julien Pérolat; Sertan Girgin; Paul Muller; Romuald Élie; Matthieu Geist; Olivier Pietquin

TL;DR

The survey addresses how to solve large-scale, symmetric multi-agent problems by combining mean-field game (MFG) theory with reinforcement learning (RL). It distinguishes static, stationary, evolutive, ergodic, and discounted mean-field settings, and connects classical mean-field equilibria with iterative best-response (BR) and dynamic programming (DP) schemes and with the mean-field dynamics. By recasting mean-field problems as mean-field Markov decision processes (MFMDPs) and mean-field environments, the authors survey model-free RL and deep RL methods that learn equilibria and social optima, including BR-based, policy-evaluation, and regularization techniques, with convergence insights and practical demonstrations in OpenSpiel. The work highlights convergence challenges, discusses damping and regularization strategies, and emphasizes scalable, data-driven approaches for high-dimensional mean-field problems, with applications ranging from crowd dynamics to macroeconomic and networked systems.

Abstract

Non-cooperative and cooperative games with a very large number of players have many applications but remain generally intractable when the number of players increases. Introduced by Lasry and Lions, and Huang, Caines and Malhamé, Mean Field Games (MFGs) rely on a mean-field approximation to allow the number of players to grow to infinity. Traditional methods for solving these games generally rely on solving partial or stochastic differential equations with a full knowledge of the model. Recently, Reinforcement Learning (RL) has appeared promising to solve complex problems at scale. The combination of RL and MFGs is promising to solve games at a very large scale both in terms of population size and environment complexity. In this survey, we review the quickly growing recent literature on RL methods to learn equilibria and social optima in MFGs. We first identify the most common settings (static, stationary, and evolutive) of MFGs. We then present a general framework for classical iterative methods (based on best-response computation or policy evaluation) to solve MFGs in an exact way. Building on these algorithms and the connection with Markov Decision Processes, we explain how RL can be used to learn MFG solutions in a model-free way. Last, we present numerical illustrations on a benchmark problem, and conclude with some perspectives.
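
To make the iterative best-response methods mentioned in the abstract concrete, here is a minimal sketch on a toy static game. It is not taken from the survey: the crowd-aversion reward `base - mu`, the entropy-regularized (softmax) best response, the function names, and the values of `temperature` and `damping` are all assumptions chosen for illustration.

```python
import numpy as np

def softmax_best_response(mu, base, temperature=0.1):
    """Regularized best response to the mean field mu (hypothetical toy model).

    Each action a yields reward base[a] - mu[a], so crowded actions are worse.
    """
    rewards = base - mu
    z = (rewards - rewards.max()) / temperature  # shift for numerical stability
    w = np.exp(z)
    return w / w.sum()

def damped_fixed_point(base, n_iter=500, damping=0.1):
    """Iterate mu <- (1 - damping) * mu + damping * BR(mu), starting from uniform."""
    mu = np.full(len(base), 1.0 / len(base))
    for _ in range(n_iter):
        br = softmax_best_response(mu, base)
        mu = (1.0 - damping) * mu + damping * br
    return mu

base = np.array([0.0, 0.1, 0.3])  # assumed intrinsic attractiveness of each action
mu = damped_fixed_point(base)
# Residual of the fixed-point equation mu = BR(mu); small at a regularized equilibrium.
residual = np.max(np.abs(mu - softmax_best_response(mu, base)))
print(mu, residual)
```

In this toy model the undamped update (`damping=1.0`) can oscillate between actions, which is one reason averaging or damping is a common choice in such schemes.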

Paper Structure (127 sections, 98 equations, 22 figures)

Introduction
Mean field games
General intuition.
Example.
Some applications.
Numerical methods.
Learning
Two notions of learning.
Learning in games.
Learning in mean field games.
Outline of the survey
Definition of the problems
Useful notations.
Static MFG
Notations.
...and 112 more sections

Figures (22)

Figure 1: Organization of the MFG settings, depending on whether there is a time component and, if yes, whether the mean field is stationary or not.
Figure 2: Reinforcement learning environment: classical single-agent setup. Here, at iteration $n$, the current state of the MDP is $x_n$, the action taken by the agent is $a_n$, the new state is $x_{n+1} \sim p(\cdot|x_n,a_n)$ and the reward is $r_n = r(x_n,a_n)$. The new state $x_{n+1}$ is observed by the agent and is also used for the next step of the environment's evolution.
Figure 3: Environment for MFGs: Here, $x_n$ is the representative agent's state, $\mu_n$ is the population distribution, $a_n$ is the action taken by the agent. The new state is $x_{n+1} \sim p(\cdot|x_n,a_n,\mu_n)$ and the reward is $r_n = r(x_n,a_n,\mu_n)$. The new state $x_{n+1}$ is observed by the agent and is also used for the next step of the environment's evolution along with $\mu_n$.
Figure 4: Environment for MFC and MFMDP.
Figure 5: Crowd in a maze: Evolution of the distributions for the methods discussed in Section \ref{sec:num-algo}. Each row corresponds to one method, each column corresponds to one time step in $\{0, 15, 30, 45, 60, 69\}$.
...and 17 more figures
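
The environment loop described in the caption of Figure 3 can be sketched as follows. This is an illustrative toy, not the survey's or OpenSpiel's implementation: the class `ToyMFGEnv`, the ring-shaped state space, the noise level, and the crowd-aversion reward are assumptions. Only the interface follows the caption, with $x_{n+1} \sim p(\cdot|x_n,a_n,\mu_n)$ and $r_n = r(x_n,a_n,\mu_n)$.

```python
import numpy as np

class ToyMFGEnv:
    """Hypothetical finite-state MFG environment on a ring of n_states cells."""

    def __init__(self, n_states=5, seed=0):
        self.n_states = n_states
        self.rng = np.random.default_rng(seed)

    def transition_probs(self, x, a, mu):
        # p(. | x, a, mu): move by a in {-1, 0, +1} with prob. 0.9, else stay.
        # This toy kernel ignores mu; a congestion model would let mu enter here.
        probs = np.zeros(self.n_states)
        probs[(x + a) % self.n_states] += 0.9
        probs[x] += 0.1
        return probs

    def reward(self, x, a, mu):
        # r(x, a, mu): crowd aversion at the current state minus a movement cost.
        return -mu[x] - 0.1 * abs(a)

    def step(self, x, a, mu):
        # The reward uses the current triple (x_n, a_n, mu_n); then x_{n+1} is sampled.
        r = self.reward(x, a, mu)
        x_next = self.rng.choice(self.n_states, p=self.transition_probs(x, a, mu))
        return int(x_next), float(r)

env = ToyMFGEnv()
mu = np.full(env.n_states, 1.0 / env.n_states)  # population distribution, held fixed
x = 0
for n in range(3):
    a = 1  # placeholder policy: always move right
    x, r = env.step(x, a, mu)
    print(n, x, r)
```

The representative agent treats `mu` as given inside the loop; updating the population distribution is left to the outer learning scheme.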

Theorems & Definitions (38)

Remark 1: On MFGs and non-atomic anonymous games
Definition 1: Static finite-population Nash equilibrium
Definition 2: Static MFG setting
Example 1
Example 2
Definition 3: Static mean field Nash equilibrium
Remark 2
Remark 3
Remark 4
Definition 4: Evolutive MFG setting
...and 28 more
