Efficient Generative Modeling beyond Memoryless Diffusion via Adjoint Schrödinger Bridge Matching
Jeongwoo Shin, Jinhwan Sul, Joonseok Lee, Jaewoong Choi, Jaemoo Choi
TL;DR
This work addresses the inefficiency of diffusion-based generative models that stems from memoryless forward dynamics. It introduces Adjoint Schrödinger Bridge Matching (ASBM), a two-stage approach: first, it learns an informative forward coupling via data-to-energy sampling; second, it optimizes the backward generative dynamic with a simple bridge-matching loss under the learned coupling. The resulting trajectories are significantly straighter and require fewer function evaluations. By using a non-memoryless base SDE, ASBM improves stability, scales to high-dimensional data, and performs strongly on image generation, with further benefits shown through distillation to a one-step generator. The reported results show superior FID and mode coverage on CIFAR-10 and FFHQ, faster convergence, and robustness across ablations, which suggests practical value for efficient diffusion-like generative modeling.
Abstract
Diffusion models often yield highly curved trajectories and noisy score targets due to an uninformative, memoryless forward process that induces independent data-noise coupling. We propose Adjoint Schrödinger Bridge Matching (ASBM), a generative modeling framework that recovers optimal trajectories in high dimensions via two stages. First, we view the Schrödinger Bridge (SB) forward dynamic as a coupling construction problem and learn it through a data-to-energy sampling perspective that transports data to an energy-defined prior. Then, we learn the backward generative dynamic with a simple matching loss supervised by the induced optimal coupling. By operating in a non-memoryless regime, ASBM produces significantly straighter and more efficient sampling paths. Compared to prior works, ASBM scales to high-dimensional data with notably improved stability and efficiency. Extensive experiments on image generation show that ASBM improves fidelity with fewer sampling steps. We further showcase the effectiveness of our optimal trajectory via distillation to a one-step generator.
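The second stage described above (a matching loss supervised by a coupling) can be illustrated with a minimal NumPy sketch. It is not the authors' implementation and rests on several assumptions: the reference process is a plain Brownian motion with constant noise level `sigma` (the paper's non-memoryless base SDE may differ), data sits at t=0 and the prior at t=1, and the coupled pairs `(x0, x1)` are simply given as arrays, whereas in ASBM they would come from the learned forward dynamic. The function names `sample_bridge`, `backward_target`, and `bridge_matching_loss` are hypothetical.

```python
import numpy as np

def sample_bridge(x0, x1, t, sigma, rng):
    """Sample x_t from a Brownian bridge pinned at x0 (t=0) and x1 (t=1).

    Assumption: the reference process is Brownian motion with constant sigma.
    """
    t = t[:, None]
    mean = (1.0 - t) * x0 + t * x1
    std = sigma * np.sqrt(t * (1.0 - t))
    return mean + std * rng.standard_normal(x0.shape)

def backward_target(x0, xt, t):
    """Drift, measured in reversed time, of the bridge conditioned to end at x0."""
    return (x0 - xt) / t[:, None]

def bridge_matching_loss(drift_fn, x0, x1, sigma, rng, t_min=1e-3):
    """Monte Carlo estimate of a bridge-matching regression loss.

    drift_fn(xt, t) is a stand-in for the backward drift model (hypothetical).
    t_min keeps the 1/t factor in the target bounded.
    """
    t = rng.uniform(t_min, 1.0, size=x0.shape[0])
    xt = sample_bridge(x0, x1, t, sigma, rng)
    pred = drift_fn(xt, t)
    target = backward_target(x0, xt, t)
    return np.mean(np.sum((pred - target) ** 2, axis=1))

# Toy usage: random "data" x0 paired with random "prior" samples x1.
rng = np.random.default_rng(0)
x0 = rng.standard_normal((256, 2))
x1 = rng.standard_normal((256, 2)) + 3.0
zero_drift = lambda xt, t: np.zeros_like(xt)
loss = bridge_matching_loss(zero_drift, x0, x1, sigma=0.5, rng=rng)
```

In this sketch the pairing of `x0` with `x1` is where the coupling enters: an independent pairing produces noisy, conflicting targets at intermediate `xt`, while a more informative pairing gives targets that vary less for nearby `xt`, which is the intuition behind the straighter trajectories the abstract describes.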
