Efficient Generative Modeling beyond Memoryless Diffusion via Adjoint Schrödinger Bridge Matching

Jeongwoo Shin, Jinhwan Sul, Joonseok Lee, Jaewong Choi, Jaemoo Choi

TL;DR

This work addresses inefficiencies in diffusion-based generative modeling by moving beyond memoryless forward dynamics. It introduces Adjoint Schrödinger Bridge Matching (ASBM), a two-stage approach that first learns an informative forward coupling via data-to-energy sampling and then optimizes the backward generative dynamic with a simple bridge-matching loss under the learned coupling, yielding significantly straighter trajectories and fewer function evaluations. By leveraging a non-memoryless base SDE, ASBM achieves improved stability, scalability to high-dimensional data, and strong performance in image generation, with additional benefits demonstrated through distillation to a one-step generator. The results show superior FID and mode coverage on CIFAR-10 and FFHQ, faster convergence, and notable robustness across ablations, indicating practical impact for efficient diffusion-like generative modeling.

Abstract

Diffusion models often yield highly curved trajectories and noisy score targets due to an uninformative, memoryless forward process that induces independent data-noise coupling. We propose Adjoint Schrödinger Bridge Matching (ASBM), a generative modeling framework that recovers optimal trajectories in high dimensions via two stages. First, we view the Schrödinger Bridge (SB) forward dynamic as a coupling construction problem and learn it through a data-to-energy sampling perspective that transports data to an energy-defined prior. Then, we learn the backward generative dynamic with a simple matching loss supervised by the induced optimal coupling. By operating in a non-memoryless regime, ASBM produces significantly straighter and more efficient sampling paths. Compared to prior works, ASBM scales to high-dimensional data with notably improved stability and efficiency. Extensive experiments on image generation show that ASBM improves fidelity with fewer sampling steps. We further showcase the effectiveness of our optimal trajectory via distillation to a one-step generator.
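
The second stage described above, regressing a backward drift onto targets supervised by coupled pairs, can be illustrated with a generic bridge-matching sketch. This is not the paper's implementation: it assumes a plain Brownian reference process with constant noise scale `sigma` (the paper uses a non-memoryless base SDE whose exact form is not given here), and the names `brownian_bridge_sample`, `backward_bridge_target`, `bridge_matching_loss`, and `drift_fn` are hypothetical. Here `x0` stands for data samples and `x1` for their coupled prior samples.

```python
import numpy as np


def _expand_time(t, ndim):
    """Reshape a (batch,) time vector so it broadcasts over feature dims."""
    return t.reshape(-1, *([1] * (ndim - 1)))


def brownian_bridge_sample(x0, x1, t, sigma, rng):
    """Sample x_t from a Brownian bridge pinned at x0 (t=0) and x1 (t=1).

    Assumption: the reference process is sigma * Brownian motion.
    """
    tb = _expand_time(t, x0.ndim)
    mean = (1.0 - tb) * x0 + tb * x1
    std = sigma * np.sqrt(tb * (1.0 - tb))
    return mean + std * rng.standard_normal(x0.shape)


def backward_bridge_target(x0, xt, t):
    """Drift of the time-reversed Brownian bridge, pointing from x_t to x0."""
    tb = _expand_time(t, x0.ndim)
    return (x0 - xt) / tb


def bridge_matching_loss(drift_fn, x0, x1, sigma, rng, t_min=1e-3):
    """Monte Carlo estimate of a squared-error bridge-matching loss.

    drift_fn(xt, t) is a hypothetical model of the backward drift.
    t is sampled away from 0 (t_min) to keep the target finite.
    """
    t = rng.uniform(t_min, 1.0, size=x0.shape[0])
    xt = brownian_bridge_sample(x0, x1, t, sigma, rng)
    target = backward_bridge_target(x0, xt, t)
    pred = drift_fn(xt, t)
    per_sample = np.sum((pred - target) ** 2, axis=tuple(range(1, x0.ndim)))
    return float(np.mean(per_sample))
```

In this sketch the quality of the pairs `(x0, x1)` is what matters: with an independent coupling the targets are noisy, whereas a learned, informative coupling (the first stage in the abstract) would give lower-variance targets. With `sigma = 0` the target reduces to the straight-line displacement `x0 - x1`.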

Paper Structure (18 sections, 3 theorems, 46 equations, 8 figures, 5 tables, 1 algorithm)

Key Result

Proposition 3.1

If the base path measure $p^{\text{base}}$ is memoryless, then the optimal path measure $p^\star$ of the SB problem is also memoryless, i.e.,
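
The displayed equation is cut off in this copy. As a hedged sketch only: under the common definition of a memoryless path measure as one whose endpoints are independent (an assumption here, consistent with the abstract's "independent data-noise coupling", and not the paper's verbatim statement), the conclusion would take a form such as

$$p^\star(X_0, X_1) = p^\star(X_0)\, p^\star(X_1),$$

where $X_0$ and $X_1$ denote the two endpoints of the path and the factors on the right are the corresponding marginals of $p^\star$. The notation for the marginals is illustrative and may differ from the paper's.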

Figures (8)

  • Figure 1: Generation trajectories of score matching and ASBM. Top: Backward drift accumulated over time at the pixel level. Bottom: Denoising path at the image level. ASBM shows a significantly smaller transport cost and a straighter path, leading to efficient generation.
  • Figure 2: Generated samples from ASBM in pixel space (CIFAR-10) and in latent space (FFHQ).
  • Figure 3: FID as a function of NFE. We use M and NM to denote the memoryless and non-memoryless conditions, respectively. BM denotes empirical bridge-matching pretraining.
  • Figure 4: Trajectory efficiency. Left: Ours shows significantly straighter trajectories, leading to low NFE at generation. Right: Ours has lower trajectory variance, implying better-organized paths.
  • Figure 5: Localized prior-data coupling. ASBM trajectories preserve information: reversing from a noised image produces samples similar to the original. In contrast, memoryless dynamics yield completely random samples due to highly noisy trajectories.
  • ...and 3 more figures

Theorems & Definitions (4)

  • Proposition 3.1
  • Theorem 1.1
  • proof
  • Theorem 2.1