Deep Generative Learning via Schrödinger Bridge
Gefei Wang, Yuling Jiao, Qian Xu, Yang Wang, Can Yang
TL;DR
This work casts deep generative learning as entropy interpolation via a Schrödinger Bridge: a reference distribution is transported to the target distribution by an SDE on the unit interval $[0,1]$ with a time-varying drift. The drift is estimated with a deep density-ratio estimator and a deep score estimator and used in a two-stage sampling procedure. The method is proved consistent under mild smoothness assumptions on the target distribution, without requiring log-concavity. Experiments show recovery of multimodal synthetic distributions and image generation on CIFAR-10 and CelebA that is competitive with state-of-the-art GANs, along with effective image interpolation and inpainting. The approach thus offers a theoretically grounded alternative to GAN-style generative models.
Abstract
We propose to learn a generative model via entropy interpolation with a Schrödinger Bridge. The generative learning task can be formulated as interpolating between a reference distribution and a target distribution based on the Kullback-Leibler divergence. At the population level, this entropy interpolation is characterized via an SDE on $[0,1]$ with a time-varying drift term. At the sample level, we derive our Schrödinger Bridge algorithm by plugging the drift term estimated by a deep score estimator and a deep density ratio estimator into the Euler-Maruyama method. Under some mild smoothness assumptions on the target distribution, we prove the consistency of both the score estimator and the density ratio estimator, and then establish the consistency of the proposed Schrödinger Bridge approach. Our theoretical results guarantee that the distribution learned by our approach converges to the target distribution. Experimental results on multimodal synthetic data and benchmark data support our theoretical findings and indicate that the generative model via Schrödinger Bridge is comparable with state-of-the-art GANs, suggesting a new formulation of generative learning. We also demonstrate its usefulness in image interpolation and image inpainting.
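To make the sample-level step concrete, the following is a minimal sketch of Euler-Maruyama integration of an SDE on $[0,1]$ with a time-varying drift. It is not the authors' implementation: the function names `euler_maruyama` and `toy_drift`, the diffusion scale `tau`, and the step count are assumptions introduced for illustration. In the method described above, the drift would come from the learned score and density ratio estimators; here a hand-written placeholder drift stands in so that the result can be checked by hand.

```python
import numpy as np

def euler_maruyama(drift, x0, n_steps=100, tau=1.0, rng=None):
    """Simulate dX_t = drift(X_t, t) dt + sqrt(tau) dW_t on [0, 1].

    `drift` is any callable (x, t) -> array shaped like x. In the paper's
    setting it would be the estimated drift; that part is not reproduced here.
    `tau` (diffusion scale) is an assumed parameter for this sketch.
    """
    rng = np.random.default_rng() if rng is None else rng
    x = np.array(x0, dtype=float)
    dt = 1.0 / n_steps
    for k in range(n_steps):
        t = k * dt  # left endpoint of the current step
        noise = rng.standard_normal(x.shape)
        x = x + drift(x, t) * dt + np.sqrt(tau * dt) * noise
    return x

def toy_drift(x, t, m=2.0):
    """Hypothetical placeholder drift b(x, t) = 2*m*t (time-varying only).

    Its integral over [0, 1] is m, so with tau = 1 and X_0 = 0 the terminal
    law is approximately N(m, 1), up to the O(1/n_steps) discretization bias.
    """
    return 2.0 * m * t * np.ones_like(x)

rng = np.random.default_rng(0)
samples = euler_maruyama(toy_drift, x0=np.zeros(20000), rng=rng)
print("terminal mean:", samples.mean(), "terminal variance:", samples.var())
```

Swapping `toy_drift` for a learned drift estimate leaves the integrator unchanged, which is the sense in which the estimated drift is "plugged into" the Euler-Maruyama method.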
