Operator Learning for Families of Finite-State Mean-Field Games

William Hofgard, Asaf Cohen, Mathieu Laurière

TL;DR

This work develops an operator-learning framework to solve parametric families of finite-state mean-field games by learning the flow map $\Phi(t, \eta, \kappa) = u^{\eta, \kappa}(t)$ that links the initial distribution and a parameterized terminal cost to the MFG value function. By generating training data via Picard iteration and training neural networks to approximate the flow map, the authors provide rigorous guarantees on approximation accuracy $O(K^{-1/(d+k+2)})$ and generalization $O(n^{-1/(d+k+4)} \log n)$ under Lipschitz regularity of the flow map. Theoretical results are complemented by numerical experiments on a cybersecurity benchmark and a high-dimensional quadratic MFG, illustrating accurate value function predictions and recovered population dynamics across a range of initial conditions and terminal-cost parameters. The framework enables efficient, generalizable evaluation of equilibria for entire families of finite-state MFGs, with potential extensions to continuous-state settings and more advanced operator-learning architectures for broader applicability.
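As a rough illustration of how these rates scale, assume (this reading is not spelled out in the summary) that $d$ is the number of states, $k$ is the dimension of the terminal-cost parameter $\kappa$, $K$ measures the size of the network, and $n$ is the number of training samples. For a model with $d = 4$ states and a scalar parameter ($k = 1$), as in the cybersecurity example, the exponents would be:

$$
K^{-1/(d+k+2)} = K^{-1/7}, \qquad n^{-1/(d+k+4)} \log n = n^{-1/9} \log n .
$$

Under that reading, halving the approximation bound takes roughly $2^{7} = 128$ times larger $K$, which suggests how quickly the guarantees weaken as $d + k$ grows.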

Abstract

Finite-state mean-field games (MFGs) arise as limits of large interacting particle systems and are governed by an MFG system, a coupled forward-backward differential equation consisting of a forward Kolmogorov-Fokker-Planck (KFP) equation describing the population distribution and a backward Hamilton-Jacobi-Bellman (HJB) equation defining the value function. Solving MFG systems efficiently is challenging, with the structure of each system depending on an initial distribution of players and the terminal cost of the game. We propose an operator learning framework that solves parametric families of MFGs, enabling generalization without retraining for new initial distributions and terminal costs. We provide theoretical guarantees on the approximation error, parametric complexity, and generalization performance of our method, based on a novel regularity result for an appropriately defined flow map corresponding to an MFG system. We demonstrate empirically that our framework achieves accurate approximation for two representative instances of MFGs: a cybersecurity example and a high-dimensional quadratic model commonly used as a benchmark for numerical methods for MFGs.
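To make the forward-backward structure concrete, here is a minimal sketch of a damped Picard iteration on a toy finite-state MFG. The model is an assumption chosen for illustration, not the paper's: players pay a quadratic cost for their jump rates, so the optimal rate from $x$ to $y$ is $(u_x - u_y)^+$, and the running cost `running_cost`, the terminal cost `terminal_cost`, the explicit Euler scheme, and all function names are hypothetical.

```python
import numpy as np

def running_cost(mu):
    # Hypothetical congestion cost F(x, mu) = mu_x: crowded states cost more.
    return mu

def terminal_cost(kappa, d):
    # Hypothetical parameterized terminal cost g_kappa(x) = kappa * x / (d - 1).
    return kappa * np.linspace(0.0, 1.0, d)

def solve_hjb(mu_path, kappa, T):
    """Backward explicit Euler for -du/dt = H(x, u) + F(x, mu), u(T) = g_kappa."""
    n_steps, d = mu_path.shape[0] - 1, mu_path.shape[1]
    dt = T / n_steps
    u = np.zeros((n_steps + 1, d))
    u[-1] = terminal_cost(kappa, d)
    for n in range(n_steps, 0, -1):
        # gain[x, y] = (u_x - u_y)^+ is the optimal jump rate from x to y.
        gain = np.maximum(u[n][:, None] - u[n][None, :], 0.0)
        hamiltonian = -0.5 * np.sum(gain ** 2, axis=1)
        u[n - 1] = u[n] + dt * (hamiltonian + running_cost(mu_path[n]))
    return u

def solve_kfp(u_path, eta, T):
    """Forward explicit Euler for the KFP equation driven by the optimal rates."""
    n_steps = u_path.shape[0] - 1
    dt = T / n_steps
    mu = np.zeros_like(u_path)
    mu[0] = eta
    for n in range(n_steps):
        rates = np.maximum(u_path[n][:, None] - u_path[n][None, :], 0.0)
        inflow = mu[n] @ rates                # sum_x mu_x * rate(x -> y)
        outflow = mu[n] * rates.sum(axis=1)   # mu_y * sum_z rate(y -> z)
        mu[n + 1] = mu[n] + dt * (inflow - outflow)
    return mu

def picard(eta, kappa, T=0.5, n_steps=100, damping=0.5, tol=1e-8, max_iter=500):
    """Alternate HJB and KFP solves until the flow of measures stops changing."""
    eta = np.asarray(eta, dtype=float)
    mu = np.tile(eta, (n_steps + 1, 1))       # initial guess: population stays at eta
    gap = np.inf
    for _ in range(max_iter):
        u = solve_hjb(mu, kappa, T)
        mu_new = solve_kfp(u, eta, T)
        gap = np.max(np.abs(mu_new - mu))
        mu = (1.0 - damping) * mu + damping * mu_new
        if gap < tol:
            break
    return u, mu, gap

u, mu, gap = picard(eta=[0.5, 0.3, 0.2], kappa=1.0)
```

Each call returns the value function and flow of measures on the time grid for one pair $(\eta, \kappa)$; a set of such trajectories is the kind of training data the operator-learning step would consume. The short horizon and small time step are chosen so that this toy iteration settles; harder instances may need stronger damping or a different scheme.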

Paper Structure

This paper contains 17 sections, 10 theorems, 67 equations, 21 figures, 3 tables, 2 algorithms.

Key Result

Theorem 4.1

Under Assumptions asmp:mfg-uniqueness and asmp:param-terminal-cost, the flow map $\Phi : [0, T] \times {\mathcal{P}}([d]) \times {\mathcal{K}} \to \mathbb{R}^d$, given by $\Phi(t, \eta, \kappa) = u^{t_0, \eta, \kappa}(t, \cdot)$, is jointly Lipschitz in its inputs: there exists $C > 0$ such that for all $(t, \eta_1, \kappa_1), (s, \eta_2, \kappa_2) \in [0, T] \times {\mathcal{P}}([d]) \times {\mathcal{K}}$, ...

Figures (21)

  • Figure 1: Given (1) sample initial distributions $\eta$ and cost parameters $\kappa$, we bypass the need to compute the optimal controls and flow of measures (Nash equilibrium) of an MFG by (2) solving the MFG system via Picard iteration. We then use the resulting trajectories to (3) approximate the solution operator for the family using a neural network, trained by minimizing an empirical loss over the samples from (1). In practice, the last step uses stochastic gradient descent (see Algorithm \ref{alg:mfg-sampling}).
  • Figure 2: Learned value function $\widehat{u}$ and true value function $u$ for four random initial distributions $\eta$ and final cost parameter $\kappa \in [0, 10]$, both drawn uniformly at random from ${\mathcal{P}}([4])$ and the interval $[0, 10]$, respectively. Each curve corresponds to one state in $\{DS, DI, US, UI\}$.
  • Figure 3: Learned flow of measures $\widehat{\mu}$ and true flow of measures $\mu$ for four random initial distributions $\eta$ and final cost parameter $\kappa \in [0, 10]$, both drawn uniformly at random from ${\mathcal{P}}([4])$ and the interval $[0, 10]$, respectively.
  • Figure 4: Comparison of true value functions $u$ and learned value functions $\widehat{u}$ for randomly sampled pairs $(\eta, \kappa)$ in dimensions $d = 3, 4, 5, 10$ respectively. Averages are taken across 5 trials, and shaded regions on approximate curves indicate error bars of one standard deviation, computed across trials.
  • Figure 5: Absolute errors of learned value functions $|u - \widehat{u}|$ for randomly sampled pairs $(\eta, \kappa)$ in dimensions $d = 3, 4, 5, 10$ respectively. Averages are taken across 5 trials, and shaded regions on approximate curves indicate error bars of one standard deviation, computed across trials.
  • ...and 16 more figures
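
The sampling described in the captions of Figures 2 and 3 (initial distributions uniform on ${\mathcal{P}}([4])$, cost parameters uniform on $[0, 10]$) can be sketched as follows. This is an illustrative sketch rather than the authors' code: the function name `sample_parameters` and its arguments are hypothetical, and it relies on the fact that a Dirichlet distribution with all concentration parameters equal to 1 is uniform on the probability simplex.

```python
import numpy as np

def sample_parameters(n_samples, d=4, kappa_max=10.0, seed=0):
    """Draw (eta, kappa) pairs: eta uniform on the simplex, kappa uniform on [0, kappa_max]."""
    rng = np.random.default_rng(seed)
    # Dirichlet(1, ..., 1) is the uniform distribution on the (d-1)-simplex.
    eta = rng.dirichlet(np.ones(d), size=n_samples)
    kappa = rng.uniform(0.0, kappa_max, size=n_samples)
    return eta, kappa

eta, kappa = sample_parameters(n_samples=4)
```

Each row of `eta` is one initial distribution over the four states, paired with the matching entry of `kappa`.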

Theorems & Definitions (19)

  • Definition 2.1
  • Definition 2.2
  • Theorem 4.1
  • Proposition 4.2
  • Corollary 4.3
  • Proposition 4.4
  • Corollary 4.5
  • Proposition C.1
  • Lemma E.1
  • Lemma E.2
  • ...and 9 more