Deep Generative Learning via Schrödinger Bridge

Gefei Wang, Yuling Jiao, Qian Xu, Yang Wang, Can Yang

TL;DR

This work introduces a Schrödinger Bridge-based approach to deep generative learning. It formulates distribution learning as entropy interpolation between a reference distribution and a target distribution, characterized by an SDE on the unit interval $[0,1]$ with a time-varying drift. Sampling proceeds in two stages, with the drift estimated by a deep density-ratio estimator and a deep score estimator, and the authors prove consistency of both estimators and of the overall procedure under mild smoothness assumptions on the target distribution. The theory does not require log-concavity. Experiments show recovery of multimodal distributions, image generation on CIFAR-10 and CelebA that is comparable with state-of-the-art GANs, and applications to image interpolation and inpainting.

Abstract

We propose to learn a generative model via entropy interpolation with a Schrödinger Bridge. The generative learning task can be formulated as interpolating between a reference distribution and a target distribution based on the Kullback-Leibler divergence. At the population level, this entropy interpolation is characterized via an SDE on $[0,1]$ with a time-varying drift term. At the sample level, we derive our Schrödinger Bridge algorithm by plugging the drift term estimated by a deep score estimator and a deep density ratio estimator into the Euler-Maruyama method. Under some mild smoothness assumptions of the target distribution, we prove the consistency of both the score estimator and the density ratio estimator, and then establish the consistency of the proposed Schrödinger Bridge approach. Our theoretical results guarantee that the distribution learned by our approach converges to the target distribution. Experimental results on multimodal synthetic data and benchmark data support our theoretical findings and indicate that the generative model via Schrödinger Bridge is comparable with state-of-the-art GANs, suggesting a new formulation of generative learning. We demonstrate its usefulness in image interpolation and image inpainting.
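The sample-level step described in the abstract, plugging a drift into the Euler-Maruyama method on $[0,1]$, can be illustrated with a minimal sketch. It assumes an SDE of the form $dX_t = b(X_t, t)\,dt + \sqrt{\tau}\,dW_t$ started at $\mathbf{0}$. The function names `euler_maruyama` and `gaussian_bridge_drift` are hypothetical. The drift used here is a closed form derived for this illustration, for a one-dimensional Gaussian target $N(m, s^2)$. It stands in for the drift that the paper estimates with deep networks and is not the paper's estimator.

```python
import numpy as np

def euler_maruyama(drift, x0, tau, n_steps, rng):
    """Integrate dX = drift(X, t) dt + sqrt(tau) dW from t=0 to t=1."""
    x = np.array(x0, dtype=float)
    h = 1.0 / n_steps  # uniform step size on [0, 1]
    for k in range(n_steps):
        t = k * h
        noise = rng.standard_normal(x.shape)
        x = x + h * drift(x, t) + np.sqrt(tau * h) * noise
    return x

def gaussian_bridge_drift(mean, std, tau):
    """Toy closed-form drift (an assumption for illustration only).

    Steers particles started at 0 so that X_1 ~ N(mean, std^2) when the
    reference process is sqrt(tau) times a Brownian motion.
    """
    c = 1.0 / std**2 - 1.0 / tau
    def drift(x, t):
        return tau * (mean / std**2 - c * x) / (1.0 + c * tau * (1.0 - t))
    return drift

rng = np.random.default_rng(0)
tau = 2.0
drift = gaussian_bridge_drift(mean=3.0, std=0.5, tau=tau)
samples = euler_maruyama(drift, np.zeros((5000, 1)), tau, n_steps=200, rng=rng)
print(samples.mean(), samples.std())
```

In this toy setting the empirical mean and standard deviation of `samples` should land close to the target values 3.0 and 0.5, up to Monte Carlo and discretization error. In the paper's setting, `drift` would instead be built from the learned score and density-ratio estimators.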

Paper Structure

This paper contains 34 sections, 14 theorems, 110 equations, 9 figures, 8 tables, 2 algorithms.

Key Result

Theorem 1

(Léonard, 2014) If $\mu, \nu \ll \mathscr{L}$, then SBP admits a unique solution $\mathbf{Q}^* = f^*(X_0)g^*(X_1)\mathbf{P}_{\tau}$, where $f^*$, $g^*$ are $\mathscr{L}$-measurable nonnegative functions on $\mathbb{R}^d$ satisfying the Schrödinger system $\left\{\right.$

Figures (9)

  • Figure 1: Overview of our two-stage algorithm. Stage 1 drives samples at $\mathbf{0}$ (left) to a smoothed data distribution (middle), and stage 2 learns the underlying target data distribution (right) from the samples produced by stage 1. Stages 1 and 2 are achieved through two different Schrödinger Bridges with theoretically guaranteed performance.
  • Figure 2: KDE plots for a mixture of Gaussians with 5,000 samples. (a) Ground truth. (b) Distribution learned by a vanilla GAN. (c) Distribution learned by the proposed method after stage 1 ($\tau=5.0$). (d) Distribution learned by the proposed method after stage 2.
  • Figure 3: Velocity fields. (a) and (b) Ground truth velocity fields at the end of stages 1 and 2. (c) and (d) Estimated velocity fields at the end of stages 1 and 2.
  • Figure 4: Particle evolution on CIFAR-10. The column in the center indicates particles obtained after stage 1.
  • Figure 5: Comparison with random image samples. (a) Samples produced by our algorithm with $\tau = 2.0$ (FID = 12.32). (b), (c), (d) Samples produced by stage 2 taking Gaussian noise with variance $1.0$ (FID = 32.60), $1.5$ (FID = 24.76), and $2.0$ (FID = 51.21) as input, respectively.
  • ...and 4 more figures

Theorems & Definitions (14)

  • Theorem 1
  • Theorem 2
  • Theorem 3
  • Theorem 4
  • Theorem 5
  • Theorem 6
  • Theorem 7
  • Theorem 8
  • Theorem 9
  • Theorem 10
  • ...and 4 more