Foundations of Schrödinger Bridges for Generative Modeling

Sophia Tang

Abstract

At the core of modern generative modeling frameworks, including diffusion models, score-based models, and flow matching, is the task of transforming a simple prior distribution into a complex target distribution through stochastic paths in probability space. Schrödinger bridges provide a unifying principle underlying these approaches, framing the problem as determining an optimal stochastic bridge between marginal distribution constraints with minimal-entropy deviations from a pre-defined reference process. This guide develops the mathematical foundations of the Schrödinger bridge problem, drawing on optimal transport, stochastic control, and path-space optimization, and focuses on its dynamic formulation with direct connections to modern generative modeling. We build a comprehensive toolkit for constructing Schrödinger bridges from first principles, and show how these constructions give rise to generalized and task-specific computational methods.

Paper Structure (72 sections, 79 theorems, 860 equations, 24 figures)

Key Result

Lemma 1.4

Let $\pi_{X,Y}, \pi'_{X,Y}\in \mathcal{P}(\mathcal{X}\times \mathcal{Y})$ be two joint probability measures with $\pi_{X,Y}\ll \pi'_{X,Y}$. Denote the $\mathcal{X}$-marginals by $\pi_X:= \int_{\mathcal{Y}}\pi_{X,Y}\,d\boldsymbol{y}$ and $\pi'_X:= \int_{\mathcal{Y}}\pi'_{X,Y}\,d\boldsymbol{y}$ …
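The statement above is truncated in this listing. As a hedged illustration only, the sketch below assumes the lemma is the standard chain rule for KL divergence, which splits the joint divergence into a marginal term and an expected conditional term: $\text{KL}(\pi_{X,Y}\|\pi'_{X,Y}) = \text{KL}(\pi_X\|\pi'_X) + \mathbb{E}_{\boldsymbol{x}\sim\pi_X}[\text{KL}(\pi_{Y|X=\boldsymbol{x}}\|\pi'_{Y|X=\boldsymbol{x}})]$. The two 2x3 probability tables are hypothetical and chosen only so that every entry is positive (so absolute continuity holds trivially).

```python
import math

def kl(p, q):
    """KL(p || q) for discrete distributions given as equal-length lists."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def marginal_x(joint):
    """Sum out y: rows index x, columns index y."""
    return [sum(row) for row in joint]

def flatten(joint):
    return [v for row in joint for v in row]

def conditional_y_given_x(joint):
    """Normalize each row to obtain the conditional distribution of y given x."""
    return [[v / sum(row) for v in row] for row in joint]

# Hypothetical joint tables (assumptions for illustration, not from the paper).
pi = [[0.10, 0.20, 0.10],
      [0.25, 0.05, 0.30]]
pi_ref = [[0.15, 0.15, 0.20],
          [0.20, 0.20, 0.10]]

joint_kl = kl(flatten(pi), flatten(pi_ref))
marginal_kl = kl(marginal_x(pi), marginal_x(pi_ref))

# Expected conditional KL, averaged under the x-marginal of pi.
expected_conditional_kl = sum(
    px * kl(cy, cy_ref)
    for px, cy, cy_ref in zip(
        marginal_x(pi),
        conditional_y_given_x(pi),
        conditional_y_given_x(pi_ref),
    )
)

print(joint_kl, marginal_kl + expected_conditional_kl)
```

Because the conditional term is non-negative, the same decomposition shows that the marginal divergence never exceeds the joint divergence.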

Figures (24)

  • Figure 1: Illustration of Relative Entropy or Kullback-Leibler (KL) Divergence. The KL divergence $\text{KL}(p\|q)$ is shown for two one-dimensional probability distributions $p, q\in \mathcal{P}(\mathcal{X})$ on the state space $\mathcal{X}\subseteq\mathbb{R}$. Left: the two distributions. Right: the signed contributions of the integrand $p\log \frac{dp}{dq}$, which integrate to the KL divergence and quantify how well $q$ approximates $p$.
  • Figure 2: Illustration of Sinkhorn's Algorithm. Starting from an initial potential (e.g. $\varphi_0:=0$), Sinkhorn's algorithm alternates updates of the dual potentials $\varphi_n$ and $\hat{\varphi}_n$ via log-integral transforms involving the transport cost $c(\boldsymbol{x},\boldsymbol{y})$ and the marginal constraints $\pi_0$ and $\pi_T$. Each alternating step enforces one marginal constraint while preserving the entropic structure, and the sequence $(\varphi_n, \hat{\varphi}_n)$ converges to the optimal dual pair $(\varphi^\star, \hat{\varphi}^\star)$ that uniquely defines the static Schrödinger bridge coupling $\pi^\star_{0,T}$.
  • Figure 3: Comparison Between Static and Dynamic Optimal Transport. Illustration of the relationship between the static and dynamic formulations of optimal transport between two marginal distributions $\pi_0$ and $\pi_T$. The static formulation minimizes the transport cost directly between endpoints via the quadratic cost $\|\boldsymbol{x}_T-\boldsymbol{x}_0\|^2$, while the dynamic (Benamou–Brenier) formulation instead seeks a time-dependent velocity field $\boldsymbol{v}_t(\boldsymbol{x})$ that continuously transports mass from $\pi_0$ to $\pi_T$ while minimizing the kinetic energy $\int_0^T\|\boldsymbol{v}_t(\boldsymbol{x})\|^2 dt$. The white trajectory represents a particle path under the optimal flow, illustrating how dynamic OT realizes the same optimal coupling as static OT through continuous mass evolution.
  • Figure 4: Brownian Motion and Controlled Itô Processes. Illustration of one-dimensional stochastic trajectories generated by two stochastic differential equations (SDEs) starting from $\boldsymbol{X}_0=0$ over the time interval $t\in [0,1]$. Left: Sample paths of pure Brownian motion of the form $d\boldsymbol{X}_t=\sigma_td\boldsymbol{B}_t$, where increments are Gaussian with variance proportional to the timestep: $\boldsymbol{B}_{t+\Delta t}=\boldsymbol{B}_t+\sqrt{\Delta t}\boldsymbol{z}$ with $\boldsymbol{z}\sim \mathcal{N}(0,\boldsymbol{I}_d)$. Right: Sample paths of a controlled Itô process of the form $d\boldsymbol{X}_t=(\boldsymbol{f}(\boldsymbol{X}_t,t)+\sigma_t\boldsymbol{u}(\boldsymbol{X}_t,t))dt+\sigma_td\boldsymbol{B}_t$, where we set $\boldsymbol{f}\equiv 0$ and $\boldsymbol{u}(\boldsymbol{x},t):=\frac{2-\boldsymbol{x}}{T-t+\epsilon}$, which pulls the process toward $\boldsymbol{X}_T=2$.
  • Figure 5: Change of Measure Using Radon-Nikodym Derivative. Sample trajectories of a diffusion process are shown under two probability measures. Top left: trajectories under the reference measure $\mathbb{Q}$, governed by the uncontrolled reference drift $\boldsymbol{f}(\boldsymbol{X}_t,t)$. Bottom left: trajectories under the controlled measure $\mathbb{P}^u$, where the dynamics include an additional control drift $\boldsymbol{u}(\boldsymbol{X}_t,t)$. Right: the same reference trajectories are reweighted according to the Radon–Nikodym derivative $\frac{\mathrm{d}\mathbb{P}^u}{\mathrm{d}\mathbb{Q}}$, which assigns higher likelihood to paths aligned with the control and lower likelihood to paths that oppose it. The color intensity represents the log-likelihood ratio $\log \frac{\mathrm{d}\mathbb{P}^u}{\mathrm{d}\mathbb{Q}}$, illustrating how the controlled dynamics can be interpreted as a change of measure on the same underlying path space.
  • ...and 19 more figures
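The alternating updates described in the Figure 2 caption can be sketched on a small discrete problem. This is a minimal illustration, not the paper's formulation: it uses the common scaling-vector form of Sinkhorn's algorithm, where the vectors `a` and `b` play the role of exponentiated dual potentials (up to sign and scaling conventions). The support points, marginals, regularization strength `eps`, and iteration count are all assumptions made for the example.

```python
import math

def sinkhorn(mu, nu, cost, eps=0.5, n_iters=500):
    """Discrete Sinkhorn iterations; returns the coupling as nested lists."""
    n, m = len(mu), len(nu)
    # Gibbs kernel built from the transport cost.
    K = [[math.exp(-cost[i][j] / eps) for j in range(m)] for i in range(n)]
    a = [1.0] * n
    b = [1.0] * m
    for _ in range(n_iters):
        # Enforce the first marginal: rows of the coupling sum to mu.
        a = [mu[i] / sum(K[i][j] * b[j] for j in range(m)) for i in range(n)]
        # Enforce the second marginal: columns of the coupling sum to nu.
        b = [nu[j] / sum(K[i][j] * a[i] for i in range(n)) for j in range(m)]
    return [[a[i] * K[i][j] * b[j] for j in range(m)] for i in range(n)]

# Hypothetical supports and marginals (assumptions for illustration).
xs = [0.0, 1.0, 2.0]
ys = [0.5, 1.5]
mu = [0.2, 0.5, 0.3]
nu = [0.6, 0.4]
cost = [[(x - y) ** 2 for y in ys] for x in xs]  # quadratic cost

coupling = sinkhorn(mu, nu, cost)
row_sums = [sum(row) for row in coupling]
col_sums = [sum(coupling[i][j] for i in range(len(mu))) for j in range(len(nu))]
print(row_sums, col_sums)
```

Each half-step matches one marginal exactly, so after convergence the row sums reproduce `mu` and the column sums reproduce `nu`. For small `eps` or large costs, a log-domain variant is usually preferred to avoid underflow in the kernel.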
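The two processes in the Figure 4 caption can be simulated with an Euler-Maruyama discretization that reuses the increment rule $\boldsymbol{B}_{t+\Delta t}=\boldsymbol{B}_t+\sqrt{\Delta t}\boldsymbol{z}$. The sketch below is a rough illustration under stated assumptions: a constant noise scale `sigma = 1`, `T = 1`, `EPS = 1e-2`, and the step and path counts are choices made here and are not taken from the paper.

```python
import math
import random

T, EPS, TARGET = 1.0, 1e-2, 2.0  # assumed values; the target 2 follows the caption

def simulate(control, n_paths=200, n_steps=1000, sigma=1.0, seed=0):
    """Euler-Maruyama for dX = sigma * u(X, t) dt + sigma dB with X_0 = 0 and f = 0.

    Returns the list of terminal values X_T, one per path.
    """
    rng = random.Random(seed)
    dt = T / n_steps
    terminal = []
    for _ in range(n_paths):
        x = 0.0
        for k in range(n_steps):
            t = k * dt
            z = rng.gauss(0.0, 1.0)  # standard normal increment
            x += sigma * control(x, t) * dt + sigma * math.sqrt(dt) * z
        terminal.append(x)
    return terminal

def no_control(x, t):
    """Pure Brownian motion: no drift."""
    return 0.0

def pull_to_target(x, t):
    """Control from the caption: u(x, t) = (2 - x) / (T - t + eps)."""
    return (TARGET - x) / (T - t + EPS)

brownian_end = simulate(no_control)
controlled_end = simulate(pull_to_target)

mean_brownian = sum(brownian_end) / len(brownian_end)
mean_controlled = sum(controlled_end) / len(controlled_end)
spread_controlled = max(controlled_end) - min(controlled_end)
print(mean_brownian, mean_controlled, spread_controlled)
```

The uncontrolled paths end scattered around 0, while the controlled paths concentrate near 2. The small `EPS` keeps the drift finite at the final step, at the cost of a terminal mean slightly below the target.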

Theorems & Definitions (138)

  • Definition 1.1: Monge's Optimal Mass Transport Problem
  • Definition 1.2: Kantorovich's Optimal Mass Transport Problem
  • Definition 1.3: Entropy Between Probability Measures
  • Lemma 1.4: Chain Rule of KL Divergences
  • Lemma 1.5: Data Processing Inequality
  • Definition 1.6: Entropic Optimal Transport (EOT) Problem
  • Definition 1.7: Static Schrödinger Bridge Problem
  • Proposition 1.8: Schrödinger Potentials
  • Proposition 1.9: Solution to Static SB Problem
  • Theorem 1.10: Dual Formulation of Static SB Problem
  • ...and 128 more