Stable generative modeling using Schrödinger bridges

Georg A. Gottwald; Fengyi Li; Youssef Marzouk; Sebastian Reich

TL;DR

This paper proposes a generative model that combines Schrödinger bridges with Langevin dynamics, and introduces a novel split-step scheme that keeps the generated samples within the convex hull of the training samples.

Abstract

We consider the problem of sampling from an unknown distribution for which only a sufficiently large number of training samples are available. Such settings have recently drawn considerable interest in the context of generative modelling and Bayesian inference. In this paper, we propose a generative model combining Schrödinger bridges and Langevin dynamics. Schrödinger bridges over an appropriate reversible reference process are used to approximate the conditional transition probability from the available training samples, which is then implemented in a discrete-time reversible Langevin sampler to generate new samples. By setting the kernel bandwidth in the reference process to match the time step size used in the unadjusted Langevin algorithm, our method effectively circumvents any stability issues typically associated with the time-stepping of stiff stochastic differential equations. Moreover, we introduce a novel split-step scheme, ensuring that the generated samples remain within the convex hull of the training samples. Our framework can be naturally extended to generate conditional samples and to Bayesian inference problems. We demonstrate the performance of our proposed scheme through experiments on synthetic datasets with increasing dimensions and on a stochastic subgrid-scale parametrization conditional sampling problem as well as generating sample trajectories of a dynamical system using conditional sampling.
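The split-step idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it replaces the Schrödinger bridge (Sinkhorn-normalized) transition weights with plain normalized Gaussian kernel weights, and the kernel scaling `4 * eps`, the helper names `kernel_weights` and `split_step`, and the toy data are all assumptions made for illustration. What it does show is why such a scheme cannot leave the convex hull: the second half-step outputs a convex combination of training points, and the kernel bandwidth and the noise step size are tied to the same parameter `eps`.

```python
import math
import random

def kernel_weights(x, data, eps):
    """Normalized Gaussian kernel weights of x against each training point.

    Assumption: a plain Gaussian kernel stands in for the Schrodinger
    bridge weights used in the paper.
    """
    # Scaled negative squared distances (the 4*eps scaling is assumed).
    logits = [-sum((xi - di) ** 2 for xi, di in zip(x, d)) / (4.0 * eps)
              for d in data]
    top = max(logits)  # subtract the max for numerical stability
    w = [math.exp(l - top) for l in logits]
    total = sum(w)
    return [wi / total for wi in w]

def split_step(x, data, eps, rng):
    """One noise half-step followed by a weighted-mean half-step."""
    # Half-step 1: add Gaussian noise with variance 2*eps per coordinate.
    noisy = [xi + math.sqrt(2.0 * eps) * rng.gauss(0.0, 1.0) for xi in x]
    # Half-step 2: map the noisy point to a convex combination of the data,
    # so the result lies in the convex hull of the training samples.
    w = kernel_weights(noisy, data, eps)
    dim = len(x)
    return [sum(wi * d[k] for wi, d in zip(w, data)) for k in range(dim)]

# Toy training set: 50 points in the square [-1, 1]^2 (hypothetical data).
rng = random.Random(0)
data = [[rng.uniform(-1, 1), rng.uniform(-1, 1)] for _ in range(50)]

# Run the chain from one of the training points and collect the iterates.
x = list(data[0])
samples = []
for _ in range(200):
    x = split_step(x, data, eps=0.01, rng=rng)
    samples.append(x)
```

Because the weights are non-negative and sum to one, every entry of `samples` is a convex combination of rows of `data`, regardless of how large the noise half-step is. This is the mechanism behind the stability claim, shown here in simplified form.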

Paper Structure

This paper contains 18 sections, 3 theorems, 61 equations, 11 figures, and 1 table.

Introduction
Related work
Outline
Discrete Schrödinger bridges
Approximating the conditional mean
Sampling algorithms
Langevin sampler with data-unaware diffusion
Langevin sampler with data-aware diffusion
Algorithmic properties
Variable bandwidth diffusion
Bayesian inference and conditional sampling
Numerical experiments
One-dimensional manifold
Multi-dimensional manifolds
Stochastic subgrid-scale parametrization
...and 3 more sections

Key Result

Lemma 3.1

Let us denote the convex hull generated by the data points $\{ x^{(i)}\}_{i=1}^M$ by $\mathcal{C}_M$. It holds that for all choices of $\epsilon > 0$ and all $x \in \mathbb{R}^d$.

Figures (11)

Figure 1: Histograms of the $x_1$- and $x_2$-components of the training as well as generated data for Example \ref{ex:2Db}. One finds that the split-step scheme effectively denoises the $x_2$-component while faithfully reproducing the standard normal distribution in the $x_1$-component.
Figure 2: Comparison of the different noise models employed by the generative model. We employed a constant bandwidth with $\epsilon=0.009$. Left: Original (blue) and generated data using a constant covariance (red) and the sample covariance $C(x)$ (magenta). Middle: Empirical histograms of the angular variable $\theta$. Right: Empirical histograms of the radial variable $r$.
Figure 3: Effect of a variable bandwidth $K(x) = \rho(x)I$ in data-sparse regions. For the generative model the Langevin sampler \ref{eq:update2_ss} is used and we set $\epsilon=0.009$. Results are shown for the output of step \ref{eq:update2_ss_b}. Left: Original (blue) and generated data for a constant bandwidth $K(x) = I$ (red). Right: Original (blue) and generated data for a variable bandwidth $K(x) = \rho(x)I$ with $\rho(x)=\pi(x)^\beta$ with $\beta=-1/5$ (magenta).
Figure 4: Effect of a variable bandwidth $K(x) = \rho(x)I$ on the angular and radial distributions (left and right, respectively). Shown are the original data (blue), generated data for a constant bandwidth $K(x) = I$ (red) and for a variable bandwidth $K(x) = \rho(x)I$ with $\rho(x)=\pi(x)^\beta$ with $\beta=-1/5$. The data were generated using a constant covariance noise model in \ref{eq:update2_ss} and $\epsilon=0.009$.
Figure 5: Effect of a variable time step $\Delta \tau$ in the Langevin sampler \ref{eq:update_ss} with constant diffusion $K=1$. Results are shown for the original data, and for $\Delta \tau=\epsilon$ and $\Delta \tau=\epsilon/4$. Throughout a constant bandwidth is used. Left: Empirical histogram of the angular variable $\theta$. Middle: Empirical histograms of the radial variable $r$. Right: Original (blue) and generated data in the $(x_1,x_2)$-plane with $\Delta \tau=1$ (red) and with $\Delta \tau=\epsilon/4$ (green).
...and 6 more figures

Theorems & Definitions (8)

Remark 3.1
Lemma 3.1
proof
Lemma 3.2
proof
Lemma 3.3
proof
Remark 4.1
