Generalized Schrödinger Bridge Matching

Guan-Horng Liu, Yaron Lipman, Maximilian Nickel, Brian Karrer, Evangelos A. Theodorou, Ricky T. Q. Chen

TL;DR

The paper addresses distribution matching when the marginals between the two boundary distributions are specified only implicitly, through a task-driven objective. It formulates this setting as the Generalized Schrödinger Bridge (GSB) problem and proposes a matching algorithm for it, GSBM. GSBM alternates between optimizing a drift $u_t^\theta$ and solving a conditional stochastic optimal control (CondSOC) problem for the intermediate marginals. CondSOC is solved with a Gaussian path approximation and debiased with path-integral theory, which preserves feasibility and enables scalable training. The key contributions are a generalization of Schrödinger Bridge matching to nontrivial state costs, a principled CondSOC formulation, and a simulation-free, parallelizable algorithm with local convergence guarantees. Experiments on crowd navigation, LiDAR geometry, unpaired image translation, and high-dimensional opinion dynamics show improved stability, interpretability, and performance over prior SB-based methods. The work extends diffusion-based distribution matching to domains that require complex, task-aligned transport costs, and the code is publicly available.

Abstract

Modern distribution matching algorithms for training diffusion or flow models directly prescribe the time evolution of the marginal distributions between two boundary distributions. In this work, we consider a generalized distribution matching setup, where these marginals are only implicitly described as a solution to some task-specific objective function. The problem setup, known as the Generalized Schrödinger Bridge (GSB), appears prevalently in many scientific areas both within and without machine learning. We propose Generalized Schrödinger Bridge Matching (GSBM), a new matching algorithm inspired by recent advances, generalizing them beyond kinetic energy minimization and to account for task-specific state costs. We show that such a generalization can be cast as solving conditional stochastic optimal control, for which efficient variational approximations can be used, and further debiased with the aid of path integral theory. Compared to prior methods for solving GSB problems, our GSBM algorithm better preserves a feasible transport map between the boundary distributions throughout training, thereby enabling stable convergence and significantly improved scalability. We empirically validate our claims on an extensive suite of experimental setups, including crowd navigation, opinion depolarization, LiDAR manifolds, and image domain transfer. Our work brings new algorithmic opportunities for training diffusion models enhanced with task-specific optimality structures. Code available at https://github.com/facebookresearch/generalized-schrodinger-bridge-matching
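
For orientation, the problem described in the abstract can be written as a stochastic optimal control problem. The display below is a sketch under assumed notation, not a quotation from the paper: $u_t$ is the drift, $V_t$ the task-specific state cost, $\sigma$ a constant diffusion coefficient, $p_t$ the marginal of $X_t$, and $p_0$, $p_1$ the two boundary distributions.

```latex
\min_{u}\ \int_0^1 \mathbb{E}_{X_t \sim p_t}\!\left[ \tfrac{1}{2}\,\lVert u_t(X_t) \rVert^2 + V_t(X_t) \right] \mathrm{d}t
\quad \text{s.t.} \quad
\mathrm{d}X_t = u_t(X_t)\,\mathrm{d}t + \sigma\,\mathrm{d}W_t,
\qquad X_0 \sim p_0,\quad X_1 \sim p_1 .
```

Setting $V_t \equiv 0$ leaves only the kinetic energy term, which is the classical Schrödinger Bridge case that the abstract says GSBM generalizes.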

Paper Structure (32 sections, 13 theorems, 62 equations, 24 figures, 3 tables, 6 algorithms)


Key Result

Proposition 0

The unique minimizer to Stage 1 coincides with $\nabla s_t^\star(X_t)$.

Figures (24)

  • Figure 1: Solutions to (6) w.r.t. different $V_t$, and how they link to different methods, including Rectified flow (Liu et al., 2023), DSBM (Shi et al., 2023), and our GSBM.
  • Figure 2: Example of the spline optimization algorithm for $\mu_t\in\mathbb{R}^2$, $\gamma_t\in\mathbb{R}$, and the resulting solution to CondSOC (6).
  • Figure 3: Feasibility vs. optimality on three crowd navigation tasks with mean-field cost.
  • Figure 4: Simulation of SDEs with the learned drift $u_t^\theta$ after long training. Notice how DeepGSB diverges drastically from our GSBM, which satisfies feasibility at all times.
  • Figure 5: Crowd navigation over a LiDAR surface. Height is denoted by the grayscale color.
  • ...and 19 more figures
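
The Gaussian path approximation mentioned in the TL;DR, with mean $\mu_t$ and standard deviation $\gamma_t$ as in Figure 2, can be illustrated with a minimal sketch. It covers only the zero-state-cost special case, where the path between a pair $(x_0, x_1)$ is a Brownian bridge; the paper instead optimizes $\mu_t$ and $\gamma_t$ as splines. All function names are hypothetical, and the drift formula is the standard one for a Gaussian path under constant diffusion $\sigma$, stated here as an assumption rather than copied from the paper.

```python
import numpy as np


def brownian_bridge_path(x0, x1, t, sigma):
    """Mean, std, and their time derivatives for the V = 0 Gaussian path.

    Assumption: linear mean and Brownian-bridge std. GSBM would replace
    these closed forms with optimized splines for mu_t and gamma_t.
    """
    mu = (1.0 - t) * x0 + t * x1
    dmu = x1 - x0
    gamma = sigma * np.sqrt(t * (1.0 - t))
    dgamma = sigma * (1.0 - 2.0 * t) / (2.0 * np.sqrt(t * (1.0 - t)))
    return mu, dmu, gamma, dgamma


def conditional_drift(x, mu, dmu, gamma, dgamma, sigma):
    """Drift of an SDE with diffusion sigma whose marginal is N(mu, gamma^2 I).

    u = dmu + a * (x - mu), with a = (dgamma - sigma^2 / (2 * gamma)) / gamma.
    """
    a = (dgamma - sigma**2 / (2.0 * gamma)) / gamma
    return dmu + a * (x - mu)


def sample_path(x0, x1, t, sigma, rng):
    """Draw X_t = mu_t + gamma_t * eps and return it with its conditional drift.

    No SDE is simulated: X_t is sampled directly at time t in (0, 1).
    """
    mu, dmu, gamma, dgamma = brownian_bridge_path(x0, x1, t, sigma)
    eps = rng.standard_normal(x0.shape)
    x = mu + gamma * eps
    return x, conditional_drift(x, mu, dmu, gamma, dgamma, sigma)
```

For the Brownian bridge, the drift returned by `sample_path` reduces to $(x_1 - x)/(1 - t)$, which gives a quick sanity check. Because each sample needs only $(x_0, x_1, t)$ and fresh noise, batches of such samples can be drawn in parallel, which is the sense in which this kind of training is simulation-free.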

Theorems & Definitions (23)

  • Proposition 0: Stage 1
  • Proposition 0: Stage 2; Conditional stochastic optimal control; CondSOC
  • Lemma 0: Analytic solution to (6) for quadratic $V$ and $\sigma>0$
  • Proposition 0: Path integral solution to (6)
  • Theorem 1: Local convergence
  • Theorem 2
  • Proposition 2: Stage 1
  • proof
  • Proposition 2: Stage 2; Conditional stochastic optimal control; CondSOC
  • proof
  • ...and 13 more