Sampling, Diffusions, and Stochastic Localization

Andrea Montanari

TL;DR

This work presents a unified framework that connects diffusion-based sampling with stochastic localization. It shows how a broad class of observation processes (Y_t) yields posterior measures μ_t that form a martingale, start at the target μ, and localize onto a single sample from μ. By recasting reverse-diffusion sampling as stochastic localization, it derives many concrete schemes (isotropic/anisotropic Gaussian, erasure, binary/symmetric, linear, information percolation, Poisson, half-space, and Euclidean-invariant variants) and clarifies how the choice of observation process and neural denoiser affects efficiency and accuracy. It also explores the interaction between the sampling scheme and the structure of the data, illustrating how to tailor schemes to capture long-range correlations in images and how symmetry considerations guide kernel design. The paper discusses practical aspects such as approximating transition probabilities, learning from samples, preserving problem symmetries, avoiding problematic regions or phase transitions, and incorporating latent variables. Overall, it offers a unified lens for designing, analyzing, and applying diffusion and localization-based sampling methods in both continuous and discrete settings.

Abstract

Diffusions are a successful technique to sample from high-dimensional distributions. The target distribution can be either explicitly given or learnt from a collection of samples. Diffusions implement a diffusion process whose endpoint is a sample from the target distribution. The drift of the diffusion process is typically represented as a neural network. Stochastic localization is a successful technique to prove mixing of Markov chains and other functional inequalities in high dimension. An algorithmic version of stochastic localization was recently proposed in order to sample from certain statistical mechanics models. This expository article has three objectives: $(i)$ Generalize the algorithmic construction to other stochastic localization processes. This construction is both simple and broadly applicable; $(ii)$ Clarify the connection between diffusions and stochastic localization. This allows one to derive several known sampling schemes in a unified fashion; $(iii)$ Describe the insights that follow from this unified viewpoint.

Paper Structure (45 sections, 4 theorems, 130 equations, 4 figures, 3 algorithms)

Introduction
Sampling
Diffusions
A special stochastic localization process
Literature overview
General stochastic localization sampling
A dozen examples of sampling schemes
The isotropic Gaussian process
The anisotropic Gaussian process
The erasure process
The binary symmetric process
The symmetric process
The linear observation process
The information percolation process
The Poisson observation process
...and 30 more sections

Key Result

Proposition 1.1

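Assume $\mu$ has finite second moment. Then $(\boldsymbol{Y}_{t})_{t\ge 0}$ is the unique solution of the following stochastic differential equation (with initial condition $\boldsymbol{Y}_0=\boldsymbol{0}$):

$$\mathrm{d}\boldsymbol{Y}_t = \boldsymbol{m}(\boldsymbol{Y}_t;t)\,\mathrm{d}t + \mathrm{d}\boldsymbol{B}_t .$$

Here $(\boldsymbol{B}_{t})_{t\ge 0}$ is a standard Brownian motion and $\boldsymbol{m}(\boldsymbol{y};t)$ is the conditional expectation defined in Eq. \eqref{eq:Denoiser}, i.e.

$$\boldsymbol{m}(\boldsymbol{y};t) := \mathbb{E}\big[\boldsymbol{x} \,\big|\, t\,\boldsymbol{x} + \sqrt{t}\,\boldsymbol{g} = \boldsymbol{y}\big], \qquad \boldsymbol{x}\sim\mu,\ \ \boldsymbol{g}\sim\mathsf{N}(\boldsymbol{0},\boldsymbol{I}_n).$$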
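As a rough illustration of how Proposition 1.1 can be turned into a sampler, the sketch below applies an Euler discretization to an SDE of the form $\mathrm{d}Y_t = m(Y_t;t)\,\mathrm{d}t + \mathrm{d}B_t$, with $m(y;t) = \mathbb{E}[x \mid t x + \sqrt{t} g = y]$. The target is an assumption chosen for simplicity: $\mu$ uniform on $\{-1,+1\}$ in one dimension, for which the posterior mean has the closed form $m(y;t)=\tanh(y)$. The function names, horizon `T`, and step count are hypothetical choices, not taken from the paper.

```python
import numpy as np

def posterior_mean(y, t):
    """m(y; t) = E[x | t*x + sqrt(t)*g = y] for x uniform on {-1, +1}.

    For this toy target the posterior mean is tanh(y), independent of t.
    """
    return np.tanh(y)

def sample_two_point(num_samples=2000, T=30.0, num_steps=3000, seed=0):
    """Euler discretization of dY = m(Y; t) dt + dB, started at Y_0 = 0."""
    rng = np.random.default_rng(seed)
    dt = T / num_steps
    y = np.zeros(num_samples)
    for k in range(num_steps):
        t = k * dt
        noise = rng.standard_normal(num_samples)
        y = y + posterior_mean(y, t) * dt + np.sqrt(dt) * noise
    # At large T the posterior is nearly a point mass, so its mean
    # serves as the approximate sample.
    return posterior_mean(y, T)

samples = sample_two_point()
print("fraction near +1:", np.mean(samples > 0.99))
print("fraction near -1:", np.mean(samples < -0.99))
```

In practice the closed-form `posterior_mean` would be replaced by a learned denoiser; the exact formula is used here only so that the discretization can be checked in isolation.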

Figures (4)

Figure 1: Generating from a mixture of two Gaussians in $n=128$ dimensions using isotropic diffusions. We compare the empirical distribution of the projection along the direction of the difference of the means with the correct distribution. Each row corresponds to a different model for the posterior mean, and each column to a different time in the generation process.
Figure 2: Generating from a mixture of two Gaussians in $n=128$ dimensions. The setting and network architecture are the same as in Fig. 1, although the generating process is different. Alongside the original data perturbed by Gaussian noise, we reveal $\langle{\boldsymbol{v}},{\boldsymbol{x}}\rangle$ for a fixed vector ${\boldsymbol{v}}$.
Figure 3: Sample images from the synthetic distribution used for the experiments in Section \ref{sec:ToyNum}. See Appendix \ref{sec:ToyDistribution} for a full definition.
Figure 4: Learning the distribution of Figure 3 and sampling via stochastic localization. Upper block: Standard isotropic diffusion. Lower block: A linear observation process. Each row corresponds to an independent realization of the generating process, with time progressing from left to right.
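
The kind of experiment described in the caption of Figure 1 can be mimicked with a small sketch. It rests on several assumptions that are not taken from the paper: the mixture is symmetric with equal weights, $\mu = \tfrac12\mathsf{N}(\boldsymbol{a},\boldsymbol{I}_n) + \tfrac12\mathsf{N}(-\boldsymbol{a},\boldsymbol{I}_n)$, the vector `a`, the horizon `T`, and the step count are hypothetical, and the exact posterior mean is used in place of a trained network. Under the isotropic observation $\boldsymbol{y} = t\boldsymbol{x} + \sqrt{t}\boldsymbol{g}$, a short Gaussian computation gives $\boldsymbol{m}(\boldsymbol{y};t) = \big(\boldsymbol{y} + \boldsymbol{a}\tanh(\langle\boldsymbol{a},\boldsymbol{y}\rangle/(1+t))\big)/(1+t)$ for this mixture.

```python
import numpy as np

def mixture_posterior_mean(y, t, a):
    """Exact m(y; t) for mu = 0.5*N(a, I) + 0.5*N(-a, I), y = t*x + sqrt(t)*g."""
    s_bar = np.tanh(y @ a / (1.0 + t))          # E[component sign | y], shape (batch,)
    return (y + s_bar[:, None] * a) / (1.0 + t)

def generate(a, num_samples=500, T=50.0, num_steps=1000, seed=0):
    """Euler discretization of dY = m(Y; t) dt + dB in dimension a.size."""
    rng = np.random.default_rng(seed)
    dt = T / num_steps
    y = np.zeros((num_samples, a.size))
    for k in range(num_steps):
        t = k * dt
        drift = mixture_posterior_mean(y, t, a)
        y = y + drift * dt + np.sqrt(dt) * rng.standard_normal(y.shape)
    return mixture_posterior_mean(y, T, a)      # approximate samples from mu

n = 128
a = np.full(n, 2.0 / np.sqrt(n))                # assumed mean vector with norm 2
samples = generate(a)
proj = samples @ a / np.linalg.norm(a)          # projection along the means difference
print("fraction in positive mode:", np.mean(proj > 0))
print("mean |projection|:", np.mean(np.abs(proj)))
```

The projection `proj` should be roughly bimodal around $\pm\|\boldsymbol{a}\|$; a histogram of it against the true one-dimensional mixture density is the kind of comparison the caption describes.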

Theorems & Definitions (13)

Proposition 1.1
Definition 3.1
Remark 3.1: Relation to earlier definitions, I
Remark 3.2: Relation to earlier definitions, II
Remark 3.3: Completeness
Remark 3.4
Remark 4.1
Remark 4.2
Remark 4.3
Proposition 7.1
...and 3 more
