Learning algorithms for mean field optimal control
H. Mete Soner, Josef Teichmann, Qinxin Yan
TL;DR
This work develops a neural-network-based numerical method for mean-field optimal control by approximating the mean-field problem with a single large N-particle system that exploits exchangeability. The control policy can depend on time, an agent's state, and the empirical distribution of the population; a universal approximation theorem on Wasserstein spaces justifies flexible representations of such policies. The authors prove convergence to the mean-field limit using propagation of chaos and bound the error due to random sampling via Rademacher complexity, giving the method a rigorous foundation. Numerical experiments on Linear Quadratic, Kuramoto, and systemic risk models demonstrate accuracy, phase-transition behavior, and robustness across parameters and initial distributions, highlighting the approach's computational efficiency and its applicability to high-dimensional, distribution-dependent control problems.
Abstract
We analyze an algorithm that numerically solves mean-field optimal control problems by approximating the optimal feedback controls with neural networks that have problem-specific architectures. We approximate the model by an $N$-particle system and leverage the exchangeability of the particles to obtain substantial computational efficiency. In addition to several numerical examples, we provide a convergence analysis. We also develop a universal approximation theorem on Wasserstein spaces.
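The $N$-particle approximation described above can be illustrated with a minimal sketch, which is not the authors' implementation. It assumes a one-dimensional toy model with dynamics $dX = a\,dt + \sigma\,dW$, quadratic running and terminal costs, and a small feedback network that sees the empirical distribution only through its first two moments. The names `init_theta`, `policy`, and `simulate_cost`, together with all parameter values, are hypothetical. Exchangeability appears as a single set of weights shared by all particles, so one rollout of the whole population gives a Monte Carlo estimate of the cost.

```python
import numpy as np

def init_theta(rng, hidden=16):
    # Hypothetical parameters of a one-hidden-layer feedback network.
    return {
        "W1": 0.1 * rng.standard_normal((4, hidden)),
        "b1": np.zeros(hidden),
        "w2": 0.1 * rng.standard_normal(hidden),
        "b2": 0.0,
    }

def policy(theta, t, x, moments):
    # One shared network for all particles (exchangeability).
    # Inputs per particle: time, own state, empirical mean, empirical second moment.
    n = x.shape[0]
    feats = np.column_stack([
        np.full(n, t),
        x,
        np.full(n, moments[0]),
        np.full(n, moments[1]),
    ])                                                   # shape (N, 4)
    hidden = np.tanh(feats @ theta["W1"] + theta["b1"])  # shape (N, H)
    return hidden @ theta["w2"] + theta["b2"]            # shape (N,)

def simulate_cost(theta, x0, rng, T=1.0, steps=50, sigma=0.3):
    # Euler-Maruyama rollout of the assumed toy dynamics dX = a dt + sigma dW.
    # Assumed costs: running a^2/2 + (x - mean)^2/2, terminal x^2/2.
    dt = T / steps
    x = x0.copy()
    cost = np.zeros_like(x)
    for k in range(steps):
        moments = (x.mean(), (x ** 2).mean())            # summary of the empirical law
        a = policy(theta, k * dt, x, moments)
        cost += (0.5 * a ** 2 + 0.5 * (x - moments[0]) ** 2) * dt
        x = x + a * dt + sigma * np.sqrt(dt) * rng.standard_normal(x.shape)
    cost += 0.5 * x ** 2
    return cost.mean()                                   # average over the N particles

rng = np.random.default_rng(0)
theta = init_theta(rng)
x0 = rng.standard_normal(200)                            # N = 200 particles
print(simulate_cost(theta, x0, rng))
```

In a training loop, the returned average cost would serve as the loss minimized over `theta` by stochastic gradient descent in an automatic-differentiation framework; that step is omitted here. Replacing the two moments with a richer summary of the empirical distribution is a design choice this sketch does not explore.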
