Variational Schrödinger Diffusion Models

Wei Deng, Weijian Luo, Yixin Tan, Marin Biloš, Yu Chen, Yuriy Nevmyvaka, Ricky T. Q. Chen

TL;DR

The paper tackles the scalability bottleneck of Schrödinger-bridge diffusion by replacing intractable forward scores with variational (linear) forward scores, enabling simulation-free training of backward scores in a multivariate diffusion. It develops the variational Schrödinger diffusion model (VSDM), derives closed-form backward-score expressions under linear approximations, and introduces stochastic-approximation-based adaptive optimization of the forward-score matrix ${\bf A}_t$. The authors prove convergence results for the adaptive scores and bound the variational transport gap, while empirically showing VSDM yields straighter transport trajectories, competitive CIFAR10 generation, and effective time-series conditioning without warm-up initializations. The work thus provides a scalable, tuning-friendly framework that preserves key OT properties and broad applicability across image, synthetic geometry, and sequence modeling tasks.

Abstract

Schrödinger bridge (SB) has emerged as the go-to method for optimizing transportation plans in diffusion models. However, SB requires estimating the intractable forward score functions, inevitably resulting in the costly implicit training loss based on simulated trajectories. To improve the scalability while preserving efficient transportation plans, we leverage variational inference to linearize the forward score functions (variational scores) of SB and restore simulation-free properties in training backward scores. We propose the variational Schrödinger diffusion model (VSDM), where the forward process is a multivariate diffusion and the variational scores are adaptively optimized for efficient transport. Theoretically, we use stochastic approximation to prove the convergence of the variational scores and show the convergence of the adaptively generated samples based on the optimal variational scores. Empirically, we test the algorithm in simulated examples and observe that VSDM is efficient in generations of anisotropic shapes and yields straighter sample trajectories compared to the single-variate diffusion. We also verify the scalability of the algorithm in real-world data and achieve competitive unconditional generation performance in CIFAR10 and conditional generation in time series modeling. Notably, VSDM no longer depends on warm-up initializations and has become tuning-friendly in training large-scale experiments.
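
To illustrate why a linear forward score restores simulation-free training, here is a minimal sketch rather than the authors' implementation. It assumes a forward SDE of the form $d{\bf x}_t = -\frac{1}{2}\beta\,{\bf D}\,{\bf x}_t\,dt + \sqrt{\beta}\,d{\bf W}_t$ with ${\bf D} = {\bf I} - 2{\bf A}$, and, for simplicity, a constant $\beta$ and a constant symmetric ${\bf A}$ with ${\bf D}$ positive definite (the paper's ${\bf A}_t$ is time-dependent and adaptively optimized). Under these assumptions ${\bf x}_t \mid {\bf x}_0$ is Gaussian with closed-form mean and covariance, so noisy samples can be drawn directly without simulating trajectories. The function names `linear_forward_marginal` and `sample_xt` are hypothetical.

```python
import numpy as np


def linear_forward_marginal(A, beta, t):
    """Return (M, Sigma) such that x_t | x_0 ~ N(M @ x_0, Sigma).

    Assumption (not taken from the source): the forward SDE is
        dx = -0.5 * beta * D x dt + sqrt(beta) dW,  D = I - 2A,
    with constant beta > 0 and a constant symmetric A making D positive definite.
    """
    d = A.shape[0]
    D = np.eye(d) - 2.0 * A
    lam, Q = np.linalg.eigh(D)  # D = Q diag(lam) Q^T
    if np.any(lam <= 0):
        raise ValueError("D = I - 2A must be positive definite")
    # Mean factor: matrix exponential exp(-0.5 * beta * t * D).
    M = (Q * np.exp(-0.5 * beta * t * lam)) @ Q.T
    # Covariance: D^{-1} (I - exp(-beta * t * D)), valid for symmetric D.
    Sigma = (Q * ((1.0 - np.exp(-beta * t * lam)) / lam)) @ Q.T
    return M, Sigma


def sample_xt(x0, A, beta, t, rng):
    """Draw x_t given a batch x0 of shape (n, d) without simulating the SDE."""
    M, Sigma = linear_forward_marginal(A, beta, t)
    w, V = np.linalg.eigh(Sigma)
    # Symmetric square root of Sigma (clip guards tiny negative round-off).
    root = (V * np.sqrt(np.clip(w, 0.0, None))) @ V.T
    eps = rng.standard_normal(x0.shape)
    return x0 @ M.T + eps @ root.T


rng = np.random.default_rng(0)
A = np.array([[-0.5, 0.2], [0.2, -1.0]])  # illustrative values only
x0 = rng.standard_normal((4, 2))
xt = sample_xt(x0, A, beta=2.0, t=0.5, rng=rng)
```

With ${\bf A} = {\bf 0}$ this reduces to the familiar isotropic variance-preserving marginal; a non-zero ${\bf A}$ lets the noise schedule differ across directions, which is the anisotropy the abstract refers to.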

Paper Structure

This paper contains 65 sections, 9 theorems, 92 equations, 12 figures, 4 tables.

Key Result

Proposition 1

Assume Lipschitz smoothness and a linear growth condition on the drift ${\boldsymbol f}$ and diffusion $g$ in the FB-SDE. Define $\overrightarrow y_t = \log \overrightarrow\psi_t({\bf x}_t)$ and $\overleftarrow y_t=\log \overleftarrow\varphi_t({\bf x}_t)$. Then the stochastic representation follows … where $\overrightarrow{\bf z}_t =\sqrt{\beta_t} \nabla \overrightarrow y_t$, …

Figures (12)

  • Figure 1: Gaussian SB (GSB) vs. VSDM on the flow trajectories.
  • Figure 2: Variational Schrödinger diffusion models (VSDMs, bottom) vs. SGMs (top) with the same hyperparameters ($\beta_{\max}=10$).
  • Figure 3: Probability flow ODE via VSDMs and SGMs. SGM with $\beta_{\max}=10$ is denoted by SGM-10 for convenience.
  • Figure 4: Unconditional generated samples from VSDM on CIFAR10 ($32\times 32$ resolution) trained from scratch.
  • Figure 5: Example for Electricity for 2 (out of 370) dimensions.
  • ...and 7 more figures

Theorems & Definitions (17)

  • Proposition 1: Nonlinear Feynman-Kac representation
  • Corollary 1
  • proof
  • Theorem 1: Generation quality
  • proof
  • Lemma 1: Lower bound of the log-Sobolev constant
  • proof
  • Lemma 2: Local stability
  • proof
  • Lemma 3: Linear growth
  • ...and 7 more