A Policy Iteration Method for Inverse Mean Field Games

Kui Ren; Nathan Soedjak; Shanyin Tong

TL;DR

The paper tackles reconstructing the obstacle function $b(x)$ in inverse mean-field games from partial observations of the value function. It introduces a policy-iteration scheme that decouples the nonlinear forward-backward MFG system into linear Fokker-Planck (FP) solves and linear inverse problems, updating the policy via $q^{(k+1)}=H_p(\nabla u^{(k)})$. The authors prove uniform convergence and a linear rate of convergence for small time horizons under quadratic Hamiltonians, and demonstrate through 1D and 2D numerics that the method is more efficient and accurate than direct PDE-constrained least-squares, including robustness to noise. The work offers a scalable, theoretically grounded approach for calibrating MFG models using observed value data with practical implications for economics, crowd dynamics, and related fields.
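To make the policy update concrete, here is a minimal sketch of $q^{(k+1)}=H_p(\nabla u^{(k)})$ for the quadratic Hamiltonian $H(p)=\frac{1}{2}|p|^2$, where $H_p(p)=p$ and the update reduces to $q=\nabla u$. The uniform periodic 1D grid, the central-difference gradient, and the function name `policy_update_quadratic` are illustrative assumptions, not the authors' discretization.

```python
import numpy as np


def policy_update_quadratic(u, dx):
    """Return q = H_p(grad u) for H(p) = |p|^2 / 2, i.e. q = grad u.

    Assumes u is sampled on a uniform periodic 1D grid (the torus T^1)
    with spacing dx. The gradient is a second-order central difference.
    """
    # np.roll wraps around the array ends, which enforces periodicity.
    return (np.roll(u, -1) - np.roll(u, 1)) / (2.0 * dx)


# Hypothetical test profile: u(x) = sin(2*pi*x), so grad u = 2*pi*cos(2*pi*x).
n = 256
x = np.arange(n) / n
u = np.sin(2 * np.pi * x)
q = policy_update_quadratic(u, 1.0 / n)
max_err = np.max(np.abs(q - 2 * np.pi * np.cos(2 * np.pi * x)))
```

For a non-quadratic Hamiltonian, the same routine would apply $H_p$ pointwise to the discrete gradient instead of returning the gradient itself.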

Abstract

We propose a policy iteration method to solve an inverse problem for a mean-field game (MFG) model, specifically to reconstruct the obstacle function in the game from the partial observation data of value functions, which represent the optimal costs for agents. The proposed approach decouples this complex inverse problem, which is an optimization problem constrained by a coupled nonlinear forward and backward PDE system in the MFG, into several iterations of solving linear PDEs and linear inverse problems. This method can also be viewed as a fixed-point iteration that simultaneously solves the MFG system and inversion. We prove its linear rate of convergence. In addition, numerical examples in 1D and 2D, along with performance comparisons to a direct least-squares method, demonstrate the superior efficiency and accuracy of the proposed method for solving inverse MFGs.
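One plausible way to write a single sweep of the decoupled iteration is sketched below. It follows the standard policy-iteration form for forward MFGs; the viscosity $\varepsilon$, the coupling $F$, the Lagrangian $L$ (Legendre transform of $H$), the data $m_0$ and $u_T$, and the additive placement of $b$ on the right-hand side are assumptions made for illustration, so the exact system in the paper may differ.

```latex
% Assumed notation (not taken from the source): \varepsilon, F, L, m_0, u_T.
\begin{align*}
&\text{(linear FP solve)} &&
  \partial_t m^{(k)} - \varepsilon \Delta m^{(k)}
  - \nabla\cdot\big(m^{(k)} q^{(k)}\big) = 0,
  \qquad m^{(k)}(x,0) = m_0(x), \\
&\text{(linear inverse problem)} &&
  -\partial_t u^{(k)} - \varepsilon \Delta u^{(k)}
  + q^{(k)}\cdot\nabla u^{(k)} - L\big(q^{(k)}\big)
  = F\big(m^{(k)}\big) + b^{(k)}(x),
  \qquad u^{(k)}(x,T) = u_T(x), \\
& && \text{with } \big(u^{(k)}, b^{(k)}\big)
  \text{ determined so that } u^{(k)} \text{ matches the observed data}, \\
&\text{(policy update)} &&
  q^{(k+1)} = H_p\big(\nabla u^{(k)}\big).
\end{align*}
```

With the policy $q^{(k)}$ frozen, the first two steps are linear in $m^{(k)}$ and in $(u^{(k)}, b^{(k)})$ respectively, which is the decoupling the abstract describes.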

Paper Structure (20 sections, 8 theorems, 56 equations, 7 figures)

Introduction
Preliminaries
Mean field games and their inverse problems
Policy iteration for solving MFGs
Policy iteration for inverse MFG problem
Direct application to least-squares of data misfit
Proposed algorithm: policy iteration method for inverse MFGs
Convergence of the policy iteration method for inverse MFG problems
Notations
Uniform convergence theorem
Linear rate of convergence
Numerical experiments
Reconstruction of a one-dimensional obstacle function
Reconstruction of a two-dimensional obstacle function
Conclusion and discussions
...and 5 more sections

Key Result

Theorem 4.1

Under the assumptions: There exists a $\bar{T}$ such that $\forall T\in (0,\bar{T}]$, the sequence $\{b^{(k)}\}_{k\ge 0}$, generated by the policy iteration method for the inverse MFG with $\partial_t u(x, T)$ data and a quadratic Hamiltonian $H(p)=\frac{1}{2}|p|^2$, converges uniformly on $\mathbb T^d$ to a solution $b^*$

Figures (7)

Figure 1: Reconstruction results of the policy iteration method for the one-dimensional inverse MFG problem \ref{eq:b-1d}. Left: reconstructed $b$ for different cases: (i) using $u(x, 0)$ data (blue dashed line) and (ii) using $\partial_t u(x, T)$ (red dotted line), compared with the true obstacle function $b^*$ (yellow solid line). Middle: the absolute error $|b(x) - b^*(x)|$ for different cases. Right: the decay of the error $\|b^{(k)} -b^*\|_{L^2}$ with respect to the number of iterations $k$ (displayed on a logarithmic scale on the y-axis).
Figure 2: Comparison of reconstruction time and relative error between the policy iteration method and the direct least-squares method for the one-dimensional inverse MFG problem \ref{eq:b-1d}.
Figure 3: Reconstruction results of the policy iteration method for the one-dimensional inverse MFG problem \ref{eq:b-1d} with noisy data. The blue dashed line is the reconstruction using $u(x,0)$ data with $1\%$ noise, and the red dotted line is the reconstruction using extra $u(x,0.2)$ data with $1\%$ noise. Left: reconstructed obstacle function $b$ compared with the true $b^*$ (solid yellow line). Middle: error $|b(x)-b^*(x)|$. Right: the corresponding reconstructed $u(x,0)$, from the solution of the MFG using the reconstructed $b$.
Figure 4: Reconstruction results of the policy iteration method for the one-dimensional inverse MFG problem \ref{eq:b-1d} for longer time periods $T\in\{10, 50\}$. Left: reconstructed $b$ for different cases and different terminal times $T$: (i) using $u(x, 0)$ data (blue dashed lines) and (ii) using $\partial_t u(x, T)$ (red dotted lines), compared with the true obstacle function $b^*$ (yellow solid line). Middle: the absolute error $|b(x) - b^*(x)|$ for different cases. Right: the decay of the error $\|b^{(k)} -b^*\|_{L^2}$ with respect to the number of iterations $k$ (displayed on a logarithmic scale on the y-axis).
Figure 5: Comparison of reconstruction time and relative error between the policy iteration method and the direct least-squares method for the one-dimensional inverse MFG problem \ref{eq:b-1d} for longer time horizons $T$.
...and 2 more figures
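The error-decay panels in Figures 1 and 4 plot $\|b^{(k)} -b^*\|_{L^2}$ on a logarithmic scale, where a linear rate of convergence shows up as a straight line. As a hedged illustration of how such a rate could be read off, the sketch below fits a line to the log-errors. The helper name `estimate_linear_rate` and the synthetic error sequence are assumptions for demonstration only, not data from the paper.

```python
import numpy as np


def estimate_linear_rate(errors):
    """Estimate rho in e_k ~ C * rho**k from a sequence of positive errors.

    Fits log(e_k) = log(C) + k * log(rho) by least squares and returns rho.
    """
    errors = np.asarray(errors, dtype=float)
    k = np.arange(len(errors))
    # polyfit returns coefficients from highest degree down: [slope, intercept].
    slope, _intercept = np.polyfit(k, np.log(errors), 1)
    return float(np.exp(slope))


# Synthetic errors with a known contraction factor of 0.5 (illustrative only).
synthetic_errors = 0.8 * 0.5 ** np.arange(10)
rho = estimate_linear_rate(synthetic_errors)
```

An estimated `rho` below 1 is consistent with linear convergence; in practice the fit should skip iterations where the error has plateaued at the discretization or noise floor.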

Theorems & Definitions (10)

Theorem 4.1
proof: Proof of \ref{THM:Convergence}
Theorem 4.2
proof: Proof of \ref{thm:linear-conv}
Lemma A.1: Lem 2.3 of CiGiMa-DGA20
Lemma A.2: Lem 2.4 of CiGiMa-DGA20
Proposition A.3: Prop 2.5 of CiGiMa-DGA20
Lemma A.4: Lem 2.4 of LaSoTa-AMO23
Proposition A.5: Prop 2.7 of LaSoTa-AMO23, Prop 2.6 of CiGiMa-DGA20
Proposition A.6: Prop 2.8 of LaSoTa-AMO23, Thm 4 of BoHaPf-AMO21