Convergence of Actor-Critic Learning for Mean Field Games and Mean Field Control in Continuous Spaces
Jean-Pierre Fouque, Mathieu Laurière, Mengrui Zhang
TL;DR
This paper gives a rigorous convergence analysis of a deep actor-critic method for infinite-horizon mean-field problems in continuous state and action spaces. A two-timescale learning-rate scheme determines whether the algorithm solves a Mean Field Game (MFG) or a Mean Field Control (MFC) problem. To identify the limit in the MFC case, the state and action spaces are discretized into bins. The analysis extends to Mean Field Control Games (MFCG), covering both idealized three-timescale results and practical stochastic-approximation algorithms. The theory shows convergence to mean-field equilibria (MFG) or mean-field optima (MFC), and numerical experiments on linear-quadratic benchmarks in one and two dimensions support it. The resulting algorithms, IH-MF-AC and IH-MFCG-AC, offer a unified framework for learning in large populations, including mixed cooperative-competitive settings. Future work includes extending the methods to finite-horizon problems and exploring broader function-approximation schemes.
Abstract
We establish the convergence of the deep actor-critic reinforcement learning algorithm presented in [Angiuli et al., 2023a] in the setting of continuous state and action spaces with an infinite discrete-time horizon. This algorithm provides solutions to Mean Field Game (MFG) or Mean Field Control (MFC) problems depending on the ratio between two learning rates: one for the value function and the other for the mean field term. In the MFC case, to rigorously identify the limit, we introduce a discretization of the state and action spaces, following the approach used in the finite-space case in [Angiuli et al., 2023b]. The convergence proofs rely on a generalization of the two-timescale framework introduced in [Borkar, 1997]. We further extend our convergence results to Mean Field Control Games, which involve locally cooperative and globally competitive populations. Finally, we present numerical experiments for linear-quadratic problems in one and two dimensions, for which explicit solutions are available.
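To make the two-timescale idea concrete, the following is a minimal sketch of a classical two-timescale iteration in the spirit of [Borkar, 1997]: two coupled iterates are updated with step sizes a_n and b_n such that b_n / a_n tends to 0, so the fast iterate effectively equilibrates while the slow one drifts. The drift functions h and g, the step-size exponents, and the function name two_timescale are hypothetical choices made for illustration; they are not the actor-critic updates analysed in the paper. In the paper's setting the two learning rates belong to the value function and the mean field term, and their ratio selects the regime, but that mapping is not reproduced here.

```python
# Toy two-timescale iteration (illustrative only; h and g are hypothetical
# drifts, not the actor-critic updates analysed in the paper).

def two_timescale(num_steps=20000, x0=0.0, y0=0.0):
    """Run a deterministic two-timescale recursion and return (x, y)."""
    x, y = x0, y0
    for n in range(num_steps):
        a_n = 1.0 / (n + 1) ** 0.6   # fast step size
        b_n = 1.0 / (n + 1)          # slow step size, so b_n / a_n -> 0
        h = y - x                    # fast drift: x relaxes toward the current y
        g = 1.0 - x                  # slow drift: y settles where x equals 1
        # Both iterates are updated simultaneously from the old values.
        x, y = x + a_n * h, y + b_n * g
    return x, y


if __name__ == "__main__":
    x, y = two_timescale()
    print(f"x = {x:.4f}, y = {y:.4f}")
```

In this toy, the fast iterate x tracks y, and the slow iterate y moves until x reaches 1, so both iterates approach 1. A stochastic version would add zero-mean noise to each drift, which is the setting the classical framework addresses.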
