A deep learning algorithm for computing mean field control problems via forward-backward score dynamics

Mo Zhou; Stanley Osher; Wuchen Li

TL;DR

The paper addresses solving mean field control (MFC) problems with diffusion by introducing deterministic forward-backward score dynamics based on the score $\nabla_x \log \rho$, replacing stochastic FBSDEs. A neural network parametrizes the adjoint $\phi(t,x)$ and KDE-based density estimates supply the score, enabling a least-squares loss that enforces the forward probability flow and the backward HJB condition along sample trajectories. The method is demonstrated on three problems—entropy-energy MFC, linear-quadratic MFC, and systemic risk—showing accurate recovery of $\phi$ and the density $\rho$ with favorable Wasserstein errors compared to BSDE baselines. This approach provides a scalable, Brownian-motion-free path for computing MFC solutions and suggests directions for theoretical analysis, improved density estimation, and extensions to cases with diffusion-control leading to fully nonlinear HJB equations.
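The KDE-based score estimate mentioned above can be illustrated with a minimal sketch. It assumes an isotropic Gaussian kernel with a fixed scalar bandwidth; the function name `kde_score`, the bandwidth value, and the test distribution are illustrative choices, not taken from the paper, whose density estimator may differ in its details.

```python
import numpy as np

def kde_score(x_query, samples, bandwidth):
    """Gaussian-KDE estimate of the score grad_x log rho at the query points.

    x_query:   (m, d) points where the score is evaluated
    samples:   (n, d) particles drawn from rho
    bandwidth: scalar kernel width h > 0 (an assumed, hand-picked value)
    """
    diff = samples[None, :, :] - x_query[:, None, :]          # (m, n, d): X_j - x
    log_w = -0.5 * np.sum(diff**2, axis=-1) / bandwidth**2    # log kernel weights
    log_w -= log_w.max(axis=1, keepdims=True)                 # avoid underflow
    w = np.exp(log_w)
    w /= w.sum(axis=1, keepdims=True)                         # normalised weights
    # grad log rho_h(x) = sum_j w_j (X_j - x) / h^2
    return np.einsum("mn,mnd->md", w, diff) / bandwidth**2

# Illustration on a standard normal, whose exact score is -x.
rng = np.random.default_rng(0)
h = 0.3
samples = rng.standard_normal((50_000, 1))
x_query = np.array([[-1.0], [0.0], [1.0]])
score = kde_score(x_query, samples, h)
```

Because the kernel smooths the density, the estimate for this Gaussian example should sit near $-x/(1+h^2)$ rather than $-x$, which is one reason bandwidth selection matters for the accuracy of the score.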

Abstract

We propose a deep learning approach to compute mean field control problems with individual noises. The problem consists of the Fokker-Planck (FP) equation and the Hamilton-Jacobi-Bellman (HJB) equation. Using the differential of the entropy, namely the score function, we first formulate the deterministic forward-backward characteristics for the mean field control system, which is different from the classical forward-backward stochastic differential equations (FBSDEs). We further apply the neural network approximation to fit the proposed deterministic characteristic lines. Numerical examples, including the control problem with entropy potential energy, the linear quadratic regulator, and the systemic risks, demonstrate the effectiveness of the proposed method.
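The idea of replacing stochastic trajectories with deterministic characteristics can be sketched as follows. The sketch assumes the noise convention $dX_t = v\,dt + \sqrt{2\gamma}\,dW_t$, so that the FP equation $\partial_t\rho + \nabla\cdot(\rho v) = \gamma\Delta\rho$ can be rewritten as the transport equation $\partial_t\rho + \nabla\cdot\big(\rho\,(v - \gamma\nabla\log\rho)\big) = 0$. The helper `probability_flow`, the forward-Euler scheme, and the zero-velocity Gaussian test case are illustrative assumptions rather than the authors' implementation.

```python
import numpy as np

def probability_flow(x0, velocity, score, gamma, T, n_steps):
    """Forward-Euler integration of dx/dt = v(t, x) - gamma * score(t, x).

    No Brownian increments are sampled: the diffusion enters only
    through the score term grad_x log rho(t, x).
    """
    dt = T / n_steps
    x = x0.copy()
    for k in range(n_steps):
        t = k * dt
        x = x + dt * (velocity(t, x) - gamma * score(t, x))
    return x

# Test case (assumed for illustration): pure diffusion, v = 0, rho_0 = N(0, 1).
# Then rho_t = N(0, 1 + 2*gamma*t) and the exact score is -x / (1 + 2*gamma*t).
gamma, T = 0.5, 1.0
velocity = lambda t, x: np.zeros_like(x)
score = lambda t, x: -x / (1.0 + 2.0 * gamma * t)

rng = np.random.default_rng(0)
x_0 = rng.standard_normal((1000, 1))
x_T = probability_flow(x_0, velocity, score, gamma, T, n_steps=1000)
```

In this test case each particle is simply rescaled, $x_T \approx x_0\sqrt{1 + 2\gamma T}$, so the ensemble reproduces the variance growth of the heat equation without any random forcing. In practice the exact score is unavailable and would be replaced by an estimate from the particles.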

Paper Structure (12 sections, 3 theorems, 47 equations, 4 figures, 2 tables, 1 algorithm)

Introduction
Formulation of the score-based mean field control problem
Numerical method
Construction of loss function
Density estimation
Numerical implementation of the score-based MFC solver
Numerical examples
An MFC example with entropy potential energy
An LQ example
Systemic risk
Conclusion and future directions
Details for the numerical implementation

Key Result

Proposition 1

Let the convexity assumption on $L$ hold. The optimal velocity field for the MFC problem is given in terms of a function $\phi\colon [0,T]\times \mathbb{R}^d\rightarrow\mathbb{R}$ that, together with the density function $\rho(t,x)$, satisfies an FP-HJB system, where $f(t,x,\rho(t,x)) = \dfrac{\partial F}{\partial \rho}(t,x,\rho(t,x))$.

Figures (4)

Figure 1: Numerical results for the MFC problem with log density running cost. The first row illustrates the results in $1$ dimension, including the training curve, plot of $\phi(0,\cdot)$, density plot, and score plot. The second row presents the results in $2$ dimensions, including the training curve, density plot for $\phi(0,x_0)$, and a contour plot comparison for the density $\rho(T,x_T)$.
Figure 2: Numerical results for the LQ MFC problem. The first row presents the results in $1$ dimension, including the training curve, plot of $\phi(0,\cdot)$, variance plot, and score plot. The second row presents the results in $2$ dimensions, including the training curve, density plot for $\phi(0,x_0)$, and a contour plot comparison for the density $\rho(T,x_T)$.
Figure 3: Particle trajectories for the true score dynamic (first row), approximated score dynamic (second row), and stochastic dynamic (FBSDE) with true velocity (third row). The score dynamic demonstrates a more structured behavior.
Figure 4: Numerical results for the systemic risk example in $1$ dimension, including the training curve, plot of $\phi(0,\cdot)$, density plot, and score plot.

Theorems & Definitions (8)

Proposition 1
proof
Remark 1
Proposition 2
proof
Remark 2
Proposition 3
proof
