Deep Reinforcement Learning for Infinite Horizon Mean Field Problems in Continuous Spaces
Andrea Angiuli, Jean-Pierre Fouque, Ruimeng Hu, Alan Raydan
TL;DR
This work develops IH-MF-AC, an online, model-free actor-critic algorithm that combines a score-based representation of the mean field distribution with Langevin sampling to solve infinite-horizon mean field game (MFG) and mean field control (MFC) problems in continuous spaces. Depending on the relative learning rates of the actor, critic, and score networks, the method can converge to either the MFG equilibrium or the MFC optimum, and it can be extended to mixed mean field control games (MFCGs). The approach is validated on linear-quadratic benchmarks, for which analytic MFG and MFC solutions are available, and it stably recovers the optimal mean field distributions and controls while offering insights into stability and exploration. The work advances model-free techniques for large-population mean field problems in continuous spaces and provides a scalable framework for future research on finite-horizon extensions and rigorous convergence guarantees.
Abstract
We present the development and analysis of a reinforcement learning (RL) algorithm designed to solve continuous-space mean field game (MFG) and mean field control (MFC) problems in a unified manner. The proposed approach pairs the actor-critic (AC) paradigm with a representation of the mean field distribution via a parameterized score function, which can be efficiently updated in an online fashion, and uses Langevin dynamics to obtain samples from the resulting distribution. The AC agent and the score function are updated iteratively and converge either to the MFG equilibrium or to the MFC optimum for a given mean field problem, depending on the choice of learning rates. A straightforward modification of the algorithm allows us to solve mixed mean field control games (MFCGs). The performance of our algorithm is evaluated using linear-quadratic benchmarks in the asymptotic infinite horizon framework.
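To illustrate how Langevin dynamics turns a score function into samples, the following is a minimal sketch rather than the authors' implementation. It assumes the unadjusted Langevin update x ← x + (ε/2)·s(x) + √ε·ξ with ξ standard normal, and it replaces the paper's parameterized score network with the analytic score of a one-dimensional Gaussian so the result can be checked. The names `langevin_samples` and `gaussian_score`, the step size, and the number of steps are all hypothetical choices made for this illustration.

```python
import numpy as np


def langevin_samples(score, n_samples, n_steps=500, step_size=1e-2, rng=None):
    """Draw approximate samples from a density given only its score.

    Uses the unadjusted Langevin update (an assumption of this sketch):
        x <- x + (step_size / 2) * score(x) + sqrt(step_size) * noise
    """
    rng = np.random.default_rng() if rng is None else rng
    # Start all chains from a standard normal initial distribution.
    x = rng.standard_normal(n_samples)
    for _ in range(n_steps):
        noise = rng.standard_normal(n_samples)
        x = x + 0.5 * step_size * score(x) + np.sqrt(step_size) * noise
    return x


def gaussian_score(mu, sigma):
    """Analytic score (gradient of the log-density) of N(mu, sigma^2).

    Stand-in for a learned score network, used here only for checking.
    """
    return lambda x: -(x - mu) / sigma**2


# Sample from N(1.0, 0.5^2) using only its score function.
samples = langevin_samples(
    gaussian_score(1.0, 0.5), n_samples=5000, rng=np.random.default_rng(0)
)
print(samples.mean(), samples.std())
```

With a small step size, the empirical mean and standard deviation of `samples` should be close to the target values 1.0 and 0.5. In an online setting such as the one described above, the analytic score would be replaced by the current parameterized score estimate.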
