Localized Schrödinger Bridge Sampler

Georg A. Gottwald; Sebastian Reich

Abstract

We consider the problem of sampling from an unknown distribution for which only a sufficiently large number of training samples are available. In this paper, we build on previous work combining Schrödinger bridges and plug & play Langevin samplers. A key bottleneck of these approaches is the exponential dependence of the required number of training samples on the dimension, $d$, of the ambient state space. We propose a localization strategy which exploits conditional independence of conditional expectation values. Localization thus replaces a single high-dimensional Schrödinger bridge problem by $d$ low-dimensional Schrödinger bridge problems over the available training samples. In this context, a connection to multi-head self-attention transformer architectures is established. As for the original Schrödinger bridge sampling approach, the localized sampler is stable and geometrically ergodic. The sampler also extends naturally to conditional sampling and to Bayesian inference. We demonstrate the performance of our proposed scheme through experiments on a high-dimensional Gaussian problem, on a temporal stochastic process, and on a stochastic subgrid-scale parametrization conditional sampling problem. We also extend the idea of localization to plug & play Langevin samplers using kernel-based denoising in combination with Tweedie's formula.
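The localization idea in the abstract can be illustrated with a minimal sketch. It is not the authors' Schrödinger bridge construction: it uses a plain Gaussian kernel-weighted mean (the kernel-denoising variant mentioned at the end of the abstract), and the function names, the periodic neighbourhood of width `radius`, and the scaling `2 * eps` in the kernel are all assumptions made for illustration. For each component, the weights over the training samples are computed from a low-dimensional neighbourhood only, rather than from the full $d$-dimensional state.

```python
import numpy as np


def local_index_sets(d, radius):
    """Hypothetical periodic neighbourhoods {a-radius, ..., a+radius} mod d."""
    offsets = np.arange(-radius, radius + 1)
    return (np.arange(d)[:, None] + offsets[None, :]) % d  # shape (d, k)


def localized_kernel_mean(x, samples, eps, radius=1):
    """Component-wise kernel-weighted mean of the training samples.

    x       : current state, shape (d,)
    samples : training samples, shape (M, d)
    eps     : kernel bandwidth parameter (assumed scaling 2 * eps)
    """
    _, d = samples.shape
    idx = local_index_sets(d, radius)                 # (d, k)
    diff = samples[:, idx] - x[idx][None, :, :]       # (M, d, k)
    sqdist = np.sum(diff ** 2, axis=2)                # (M, d) local distances
    logits = -sqdist / (2.0 * eps)
    logits -= logits.max(axis=0, keepdims=True)       # numerical stability
    weights = np.exp(logits)
    weights /= weights.sum(axis=0, keepdims=True)     # softmax over samples
    # Each component is a convex combination of that component's training values.
    return np.sum(weights * samples, axis=0)          # shape (d,)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    samples = rng.normal(size=(200, 10))
    x = rng.normal(size=10)
    m = localized_kernel_mean(x, samples, eps=0.5, radius=1)
    print(m.shape)
```

In this sketch each component has its own softmax over the training samples, which loosely resembles one attention head per component with the training set acting as keys and values. In the non-localized Gaussian-kernel case, Tweedie's formula relates such a mean to a score estimate via $(m(x) - x)/\epsilon$; whether this scaling matches the paper's convention is an assumption here.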

Paper Structure (15 sections, 2 theorems, 85 equations, 6 figures, 1 algorithm)

Introduction
Plug & play Langevin sampler
Schrödinger bridge sampler
Kernel denoising and Tweedie's formula
Localized Schrödinger bridge sampler
Motivational example: Gaussian setting
Numerical illustration
Localized Schrödinger bridge sampler for general measures
Algorithmic properties
Localized kernel-denoising
Localised Schrödinger bridge sampler for temporal stochastic processes
Conditional localized Schrödinger bridge sampler
Conditional sampling for a closure problem
Conclusions
Acknowledgements.

Key Result

Lemma 1

Let us introduce the set $\mathcal{C}_M\subset \mathbb{R}^d$ defined by [...]. It holds that the vector $m_{\rm loc}(x;\epsilon) \in \mathbb{R}^d$ of localized expectation values satisfies [...] for all choices of $\epsilon > 0$ and all $x \in \mathbb{R}^d$.

Figures (6)

Figure 1: Comparison of the samples obtained from three different variants of localized Schrödinger bridge samplers. We show the centered rows of the empirical covariance matrix $\hat{C}$ (top row) and empirical histograms (bottom row) obtained from using all $d=101$ components. The blue markers denote the empirical covariance for the given samples; the magenta markers show the average over all $d$ rows. Left column: Localized EM-type sampler (\ref{eq:propagator_general_loc}); middle column: Localized split-step sampler (\ref{eq:update_ss_loc}); right column: Localized sampler (\ref{eq:propagator general local}) with data-aware diffusion matrix (\ref{eq:cm_estimate_local}). Given the large value of $\epsilon=1$, only (\ref{eq:propagator general local}) is able to faithfully reproduce the target measure ${\rm N}(0,C)$.
Figure 2: Comparison of the samples obtained from the localized split-step sampler (\ref{eq:update_ss_loc}) for varying parameter $\epsilon$. Left column: $\epsilon = 1.0$; middle column: $\epsilon = 0.1$; right column: $\epsilon = 0.01$. While $\epsilon = 0.1$ leads to improved results, both larger and smaller values of $\epsilon$ degrade the performance of the split-step sampler.
Figure 3: Generated trajectories for the bimodal SDE (\ref{eq:bistable LD}) using the localized Schrödinger bridge split-step sampler with constant diffusion (\ref{eq:update_ss_loc}). Left panel: $100$ trajectories out of $M=1{,}000$ training samples; right panel: $100$ trajectories out of $N=25{,}000$ generated samples. The computed transition rates (relative number of sign changes along trajectories) agree well, at $9\%$ for the training data and $11\%$ for the generated data.
Figure 4: Normalized empirical histograms of training and generated data for the bimodal SDE (\ref{eq:bistable LD}) using the localized Schrödinger bridge split-step sampler with constant diffusion (\ref{eq:update_ss_loc}). We show results over all $1{,}000$ training and $25{,}000$ generated data points. The invariant distribution of the bimodal SDE is well reproduced by the generated data, although the dispersion of the generated data in each of the two modes is slightly smaller than that of the training data; this was also observed for the split-step scheme in Figure \ref{fig:multiGauss2}.
Figure 5: Comparison of the samples obtained from the localized Schrödinger bridge sampler and given samples drawn from the multi-scale Lorenz-96 system (\ref{eq:L96}) using nearest neighbor localization with $\Lambda(\alpha)=\{\alpha-1,\alpha,\alpha+1,\, K+\alpha-1,K+\alpha,K+\alpha+1\}$, with the obvious periodic extensions for $\alpha=1$ and $\alpha=d$. We consider $40{,}000$ new and given samples. Left: Empirical histograms. Right: Scatter plot of the closure term $\psi$ as a function of $z$.

Theorems & Definitions (9)

Remark 1
Remark 2
Remark 3
Remark 4
Lemma 1
proof
Lemma 2
proof
Remark 5
