
Light and Optimal Schrödinger Bridge Matching

Nikita Gushchin, Sergei Kholkin, Evgeny Burnaev, Alexander Korotin

TL;DR

This work addresses learning Schrödinger Bridges (SB) between two distributions from an input transport plan. It introduces optimal Schrödinger bridge matching, built on an optimal projection (OP) that recovers the SB in a single projection step and works for any input plan. It then instantiates a practical solver, LightSB-M, which parameterizes the adjusted Schrödinger potential as a Gaussian mixture to obtain closed-form drifts and fast sampling via the self-similarity of the Brownian bridge, and shows that the resulting objective coincides with energy-based modeling (EBM) objectives for EOT/SB. Theoretical results show that the OP returns the true SB, and experiments on SB benchmarks, single-cell trajectories, and unpaired image translation report competitive accuracy at low computational cost (for example, convergence on 4 CPU cores in several minutes for the image translation task). Overall, the approach offers a principled and efficient framework for SB/EOT tasks.
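To make the "closed-form drifts" remark concrete, here is a minimal one-dimensional sketch. It is not the LightSB-M parameterization: it assumes, for simplicity, that the Schrödinger potential $\varphi$ itself (rather than the adjusted potential used in the paper) is a Gaussian mixture, and it relies on the standard identity that the drift is $\epsilon \nabla_x \log \varphi_t(x)$, where $\varphi_t$ is $\varphi$ convolved with a Gaussian of variance $\epsilon(1-t)$. All function and argument names are hypothetical.

```python
import math


def normal_pdf(x, mean, var):
    """Density of a 1D Gaussian with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def smoothed_potential(x, t, weights, means, variances, eps):
    """phi_t(x): the mixture potential convolved with N(0, eps * (1 - t)).

    Convolving two Gaussians adds their variances, so each component
    simply gets variance s_k + eps * (1 - t).
    """
    return sum(
        a * normal_pdf(x, r, s + eps * (1.0 - t))
        for a, r, s in zip(weights, means, variances)
    )


def mixture_drift(x, t, weights, means, variances, eps):
    """Closed-form drift eps * d/dx log phi_t(x) for the assumed mixture.

    Note: for x far from every component the densities can underflow;
    a log-sum-exp formulation would be more robust in practice.
    """
    terms = []
    for a, r, s in zip(weights, means, variances):
        var_t = s + eps * (1.0 - t)
        terms.append((a * normal_pdf(x, r, var_t), r, var_t))
    total = sum(c for c, _, _ in terms)
    # Responsibility-weighted average of the per-component linear drifts.
    return eps * sum((c / total) * (r - x) / var_t for c, r, var_t in terms)
```

With a single component the drift reduces to the linear pull `eps * (r - x) / (s + eps * (1 - t))` toward the component mean, and with several components it is a responsibility-weighted average of such pulls, so no numerical integration is needed at sampling time.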

Abstract

Schrödinger Bridges (SB) have recently gained the attention of the ML community as a promising extension of classic diffusion models that is also closely connected to Entropic Optimal Transport (EOT). Recent solvers for SB exploit the widely used bridge matching procedures. Such procedures aim to recover a stochastic process transporting the mass between distributions given only a transport plan between them. In particular, given the EOT plan, these procedures can be adapted to solve SB. This fact is heavily exploited by recent works, giving rise to matching-based SB solvers. The cornerstone here is recovering the EOT plan: recent works either use heuristic approximations (e.g., minibatch OT) or establish iterative matching procedures which by design accumulate error during training. We address these limitations and propose a novel procedure to learn SB, which we call optimal Schrödinger bridge matching. It exploits the optimal parameterization of the diffusion process and provably recovers the SB process (a) with a single bridge matching step and (b) with an arbitrary transport plan as the input. Furthermore, we show that the optimal bridge matching objective coincides with the recently discovered energy-based modeling (EBM) objectives to learn EOT/SB. Inspired by this observation, we develop a light solver (which we call LightSB-M) to implement optimal matching in practice using the Gaussian mixture parameterization of the adjusted Schrödinger potential. We experimentally showcase the performance of our solver in a range of practical tasks. The code for our solver can be found at https://github.com/SKholkin/LightSB-Matching.
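The bridge matching procedure mentioned in the abstract can be illustrated with a minimal one-dimensional sketch. Given endpoint pairs drawn from any transport plan, an intermediate point is sampled from the Brownian bridge with volatility $\epsilon$, and a drift is regressed onto the target $(x_1 - x_t)/(1 - t)$. This is a generic illustration under stated assumptions, not the authors' implementation: `drift_fn`, `t_max` (a cutoff that avoids dividing by zero near $t = 1$), and all other names are hypothetical, and the paper additionally restricts the drift to the SB-optimal parameterization.

```python
import math
import random


def sample_brownian_bridge(x0, x1, t, eps, rng):
    """Sample x_t from the Brownian bridge pinned at x0 (t=0) and x1 (t=1).

    For a Wiener process with volatility eps, the bridge marginal is
    Gaussian with mean (1 - t) * x0 + t * x1 and variance eps * t * (1 - t).
    """
    mean = (1.0 - t) * x0 + t * x1
    std = math.sqrt(eps * t * (1.0 - t))
    return mean + std * rng.gauss(0.0, 1.0)


def bridge_matching_target(x_t, x1, t):
    """Regression target for the drift at (x_t, t), given the endpoint x1."""
    return (x1 - x_t) / (1.0 - t)


def bridge_matching_loss(drift_fn, pairs, eps, rng, t_max=0.99):
    """Monte Carlo estimate of the bridge matching loss over endpoint pairs.

    `pairs` is a list of (x0, x1) samples from some transport plan;
    `drift_fn(x, t)` is the candidate drift being evaluated.
    """
    total = 0.0
    for x0, x1 in pairs:
        t = rng.uniform(0.0, t_max)  # assumed cutoff below t = 1
        x_t = sample_brownian_bridge(x0, x1, t, eps, rng)
        total += (drift_fn(x_t, t) - bridge_matching_target(x_t, x1, t)) ** 2
    return total / len(pairs)
```

As a sanity check, with `eps = 0` the bridge collapses to a straight line, the target equals `x1 - x0`, and a constant drift matching that displacement gives a loss of essentially zero.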

Paper Structure

This paper contains 32 sections, 5 theorems, 43 equations, 8 figures, 8 tables, 1 algorithm.

Key Result

Theorem 3.1

The optimal projection of a reciprocal process $T_{\pi}$, given by a joint distribution $\pi \in \Pi(p_0, p_1)$, leads to the Schrödinger Bridge $T^{*}$ between the distributions $p_0$ and $p_1$.

Figures (8)

  • Figure 1: Unpaired adult $\rightarrow$ child translation with our LightSB-M solver applied in the latent space of ALAE (Pidhorskyi et al., 2020) for 1024×1024 FFHQ images (Karras et al., 2019). Our LightSB-M solver converges on 4 CPU cores in several minutes.
  • Figure 2: The process $S_{\theta}$ learned with LightSB-M (ours) in the Gaussian $\rightarrow$ Swiss roll example.
  • Figure 3: Unpaired translation between subsets of the FFHQ dataset (1024×1024) performed by various SB solvers in the latent space of ALAE (Pidhorskyi et al., 2020).
  • Figure 4: Dynamic KL evaluation. $\mathcal{L}^{2}_{\text{fwd}}[t]$ and $\mathcal{L}^{2}_{\text{bwd}}[t]$ values w.r.t. time for different algorithms. Results denoted as "Best solver (benchmark)" are taken from the benchmark paper (Gushchin et al., 2023).
  • Figure 5: The process $S_{\theta}$ learned by HardSB-M (ours) with the MC drift estimator in the Gaussian $\rightarrow$ Swiss roll example.
  • ...and 3 more figures

Theorems & Definitions (10)

  • Theorem 3.1: OP of a reciprocal process
  • Theorem 3.2: Tractable objective for the OP
  • Theorem 3.3: Equivalence to EgNOT/LightSB objective
  • Proof of Theorem 3.1 (OP of a reciprocal process)
  • Proof of Theorem 3.2 (Tractable objective for the OP)
  • Proof of Theorem 3.3 (Equivalence to EgNOT/LightSB objective)
  • Theorem 3.1: HardSB-M drift expression
  • Theorem 3.2: HardSB-M loss gradient expression
  • Proof of the HardSB-M drift expression theorem
  • Proof of the HardSB-M loss gradient expression theorem