Deep Generalized Schrödinger Bridges: From Image Generation to Solving Mean-Field Games

Guan-Horng Liu, Tianrong Chen, Evangelos A. Theodorou

TL;DR

This work extends Schrödinger Bridges to Generalized Schrödinger Bridges by incorporating a time- and space-dependent potential $V(x,t)$, enabling richer transport problems. It formulates a mesh-free, neural-SDE framework that harnesses the nonlinear Feynman–Kac lemma to convert PDE optimality into trainable stochastic dynamics, yielding likelihood and temporal-difference objectives that enforce boundary constraints while accounting for kinetic and potential energies. The authors introduce forward-backward Neural SDE representations, establish variational bounds via KL divergences, and propose joint or alternating training schemes to recover GSB solutions. Demonstrations on image generation and mean-field games illustrate the approach’s versatility in solving distribution-constrained stochastic control problems with deep learning tools, highlighting potential impacts in generative modeling and multi-agent systems.

Abstract

Generalized Schrödinger Bridges (GSBs) are a fundamental mathematical framework used to analyze the most likely particle evolution based on the principle of least action including kinetic and potential energy. In parallel to their well-established presence in the theoretical realms of quantum mechanics and optimal transport, this paper focuses on an algorithmic perspective, aiming to enhance practical usage. Our motivating observation is that transportation problems with the optimality structures delineated by GSBs are pervasive across various scientific domains, such as generative modeling in machine learning, mean-field games in stochastic control, and more. Exploring the intrinsic connection between the mathematical modeling of GSBs and the modern algorithmic characterization therefore presents a crucial, yet untapped, avenue. In this paper, we reinterpret GSBs as probabilistic models and demonstrate that, with a delicate mathematical tool known as the nonlinear Feynman-Kac lemma, rich algorithmic concepts, such as likelihoods, variational gaps, and temporal differences, emerge naturally from the optimality structures of GSBs. The resulting computational framework, driven by deep learning and neural networks, operates in a fully continuous state space (i.e., mesh-free) and satisfies distribution constraints, setting it apart from prior numerical solvers relying on spatial discretization or constraint relaxation. We demonstrate the efficacy of our method in generative modeling and mean-field games, highlighting its transformative applications at the intersection of mathematical modeling, stochastic processes, control, and machine learning.
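
To make the "kinetic plus potential energy" objective concrete, here is a minimal sketch, not the authors' implementation. It assumes a controlled SDE of the form $dX_t = u(X_t,t)\,dt + \sigma\,dW_t$ on $t\in[0,1]$ and a running cost $\tfrac{1}{2}\|u\|^2 + V(x,t)$. The function name `simulate_cost`, the hand-picked drift, and the bump-shaped potential are all hypothetical stand-ins. In the paper's framework the drift would be a trained neural network and the terminal samples would have to match a target distribution, neither of which this sketch attempts.

```python
import numpy as np

def simulate_cost(drift, potential, x0, sigma=1.0, n_steps=100, seed=0):
    """Euler-Maruyama rollout of dX_t = u(X_t, t) dt + sigma dW_t on [0, 1].

    Returns the terminal particles and a Monte Carlo estimate of
    E[ int_0^1 0.5 * |u|^2 + V(X_t, t) dt ]  (kinetic + potential energy).
    """
    rng = np.random.default_rng(seed)
    dt = 1.0 / n_steps
    x = np.array(x0, dtype=float)           # shape (n_particles, dim)
    cost = np.zeros(x.shape[0])             # running cost per particle
    for k in range(n_steps):
        t = k * dt
        u = drift(x, t)                     # control evaluated at the current state
        cost += (0.5 * np.sum(u**2, axis=1) + potential(x, t)) * dt
        noise = rng.standard_normal(x.shape)
        x = x + u * dt + sigma * np.sqrt(dt) * noise
    return x, cost.mean()

# Hypothetical 1-D example: particles start near x = -2 and are pushed right
# at constant speed through a potential bump centred at the origin.
x0 = np.random.default_rng(1).normal(-2.0, 0.5, size=(1024, 1))
drift = lambda x, t: np.full_like(x, 4.0)
potential = lambda x, t: np.exp(-np.sum(x**2, axis=1))
x1, cost = simulate_cost(drift, potential, x0, sigma=0.5)
print(x1.mean(), cost)
```

Setting `potential` to zero leaves only the kinetic term, which corresponds to the plain SB setting described above. A nonzero `potential` penalizes paths that cross high-cost regions, which is the extra ingredient a GSB accounts for.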

Paper Structure

This paper contains 18 sections, 3 theorems, 50 equations, 4 figures.

Key Result

Theorem 1

Consider the random variables $Y_t$ and $\widehat{Y}_t$, where $X_t$ solves eq:sde. Then $Y_t$ and $\widehat{Y}_t$ solve the optimality SDEs eq:y and eq:yhat, and the solutions to the PDEs in eq:sb-pde can be recovered by taking conditional expectations.

Figures (4)

  • Figure 1: Illustration of Schrödinger Bridge (SB) and Generalized SB on a 1-dimensional example w.r.t. a time-varying potential energy (purple contours). While SB seeks stochastic processes with minimal kinetic energy, GSB instead considers processes that jointly minimize both kinetic and potential energies.
  • Figure 2: Overview of our methodology. We begin with the mathematical foundation of Generalized Schrödinger Bridges (GSBs) and SBs, which are characterized by the SDEs in eq:sb-sde and their optimality conditions represented as PDEs in eq:sb-pde. We propose a novel perspective, casting these SDEs as probabilistic models that construct stochastic transport maps between samples drawn from two different distributions, leading to a parametrization with Neural SDEs. On the other hand, the PDE optimality conditions in eq:sb-pde admit a stochastic representation and can be transformed into computationally more tractable optimality SDEs in eq:y and eq:yhat via a mathematical tool known as the nonlinear Feynman-Kac lemma (i.e., Theorem 1). This transformation facilitates the emergence of common learning objectives such as likelihoods in eq:nll and eq:nll2, variational gaps (Theorem 2), and temporal differences in eq:td and eq:td2. We show in Theorem 3 that optimizing the combined objectives provides necessary and sufficient conditions for GSB solutions, making them apt for learning approximate solutions of GSBs. In essence, our work reveals an intrinsic connection between mathematical modeling of GSBs, stochastic processes, control, and deep learning.
  • Figure 3: Results of generative modeling in image domains. (A) Simulations of the learned Neural SDEs, exemplified with the handwritten-digit data set MNIST. While $X_t^\theta$ evolves along the forward time coordinate $t$, transforming Gaussian noise to an image (in this case a digit 3), $\bar{X}_s^\phi$ instead moves backward along the coordinate $s \coloneqq 1-t$ and generates random noise. (B) Training progress of Cifar10 with respect to the Fréchet Inception Distance (FID) score. Notice how the FID score stably decreases, improving the fidelity of the generated images over time. (C) Generated samples, i.e., $X_1^\theta$, on all three tested data sets, including MNIST (left), CelebA (middle), and Cifar10 (right).
  • Figure 4: Results of mean-field games (MFGs) in crowd navigation scenarios. (A) The configurations of three distinct MFGs, namely GMM, V-neck, and S-tunnel, showcasing the boundary distributions $\mu$ (blue), $\nu$ (red), and restricted areas (gray). (B) Training progress of GMM, providing insights into how the likelihood and temporal difference (TD) objectives contribute to the combined training objective. We report the average over 5 random seeds. (C) The simulation results on GMM, where seven population snapshots are displayed at regular intervals over $t\in[0,1]$, each with a different color. (D and E) Additionally, an ablation study is conducted on the entropy interaction cost and diffusion coefficient $\sigma$ for V-neck and S-tunnel, respectively. For optimal visualization, four population snapshots are presented for V-neck, and seven for S-tunnel.

Theorems & Definitions (3)

  • Theorem 1
  • Theorem 2
  • Theorem 3