Posterior Sampling in High Dimension via Diffusion Processes

Andrea Montanari, Yuchen Wu

TL;DR

This work introduces a diffusion-based paradigm for posterior sampling in high dimensions, constructing a non-homogeneous diffusion whose drift is tied to the posterior mean and discretizing it via Euler steps. The authors fuse AMP and TAP-based variational ideas to approximate the posterior mean, enabling efficient sampling in canonical problems like spiked matrix models and high-dimensional linear regression. They establish theoretical Wasserstein guarantees under regularity and Lipschitz conditions and demonstrate practical algorithms with scalable complexity, supported by simulations that align with AMP predictions. The approach bridges stochastic localization, denoising diffusion concepts, and variational inference to deliver provable, scalable posterior sampling for challenging statistical models.

Abstract

Sampling from the posterior is a key technical problem in Bayesian statistics. Rigorous guarantees are difficult to obtain for Markov Chain Monte Carlo algorithms of common use. In this paper, we study an alternative class of algorithms based on diffusion processes and variational methods. The diffusion is constructed in such a way that, at its final time, it approximates the target posterior distribution. The drift of this diffusion is given by the posterior expectation of the unknown parameter vector $\boldsymbol{\theta}$ given the data and the additional noisy observations. In order to construct an efficient sampling algorithm, we use a simple Euler discretization of the diffusion process, and leverage message passing algorithms and variational inference techniques to approximate the posterior expectation oracle. We apply this method to posterior sampling in two canonical problems in high-dimensional statistics: sparse regression and low-rank matrix estimation within the spiked model. In both cases we develop the first algorithms with accuracy guarantees in the regime of constant signal-to-noise ratios.
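
The Euler-discretized diffusion described in the abstract can be illustrated on a toy target where the posterior-mean oracle has a closed form. The sketch below is not the paper's algorithm: it assumes a product distribution on $\{-1,+1\}^n$ with per-coordinate bias parameter `h`, for which the conditional mean given the noisy observation $\boldsymbol{y} = t\boldsymbol{\theta} + \sqrt{t}\,\boldsymbol{g}$ is $\tanh(h + y_i)$ coordinatewise, so no message passing or variational approximation is needed. The function names `posterior_mean` and `sample_via_diffusion` are hypothetical, and the default number of steps and step size simply reuse the values $L = 500$ and $\Delta = 0.02$ quoted in the Figure 1 caption.

```python
import numpy as np


def posterior_mean(y, h):
    """Closed-form drift for the toy target (an assumption of this sketch).

    Target: independent coordinates theta_i in {-1, +1} with
    P(theta_i = +1) = exp(h) / (exp(h) + exp(-h)).
    Given y = t * theta + sqrt(t) * g with g standard Gaussian, the
    conditional mean of theta_i is tanh(h + y_i); t enters only through y.
    """
    return np.tanh(h + y)


def sample_via_diffusion(n, h, num_steps=500, step=0.02, rng=None):
    """Euler discretization of dy_t = m(y_t, t) dt + dB_t, started at y_0 = 0.

    Returns the posterior-mean estimate at the final time T = num_steps * step,
    which is close to a vector in {-1, +1}^n when T is large.
    """
    rng = np.random.default_rng() if rng is None else rng
    y = np.zeros(n)
    for _ in range(num_steps):
        drift = posterior_mean(y, h)  # exact oracle here; approximate in general
        y = y + step * drift + np.sqrt(step) * rng.standard_normal(n)
    return posterior_mean(y, h)


if __name__ == "__main__":
    h = 0.5
    theta_alg = sample_via_diffusion(n=20000, h=h, rng=np.random.default_rng(0))
    # Under the toy target, E[theta_i] = tanh(h); the empirical mean of the
    # output should be close to it, up to discretization and sampling error.
    print("empirical mean:", theta_alg.mean(), "target mean:", np.tanh(h))
```

In the models treated in the paper the posterior mean is not available in closed form, which is where the message passing and variational approximations of the oracle come in; the loop structure above is the only part this toy example is meant to convey.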

Paper Structure (70 sections, 45 theorems, 280 equations, 3 figures, 6 algorithms)

Key Result

Theorem 1

Assume that $\|\hat{\boldsymbol m}_{\boldsymbol{\theta}}(\boldsymbol{y},T)\|_2 \le {\overline{R}}\sqrt{n}$ for all $\boldsymbol{y} \in \mathbb{R}^N$ (this can always be achieved by projection onto the ball ${\sf B}^n(\boldsymbol{0},{\overline{R}}\sqrt{n})$) and that conditions A1, A2 and A3 hold. [...] If in addition $\boldsymbol{H}$ has full column rank, and $\int(\|\boldsymbol{\theta}\|^2_2/n)^{c_0

Figures (3)

  • Figure 1: Trajectories of the first and second coordinates of the estimated mean vectors computed by Algorithm \ref{alg:Spiked-Sampling-AMP}, in the case of $\mathbb{Z}_2$-synchronization. For this experiment we set $n = 1000$, $L = 500$, and $\Delta = 0.02$.
  • Figure 2: Histograms: empirical distributions of $\langle \boldsymbol{\theta}_{\leq 10}, \boldsymbol{\theta}_{\leq 10}^{\hbox{\rm\tiny alg}} \rangle$ for samples generated by Algorithm \ref{alg:Spiked-Sampling-AMP}, based on a single realization of the data $(\boldsymbol{X},\boldsymbol{\theta})$ at each value of $\beta$. Continuous line: theoretical prediction approximating the distribution of $\langle \boldsymbol{\theta}_{\leq 10}, \boldsymbol{\theta}_{\leq 10}^{\hbox{\rm\tiny alg}} \rangle$ with the true posterior.
  • Figure 3: Bands: normalized log-likelihood achieved by Algorithm \ref{alg:general-sampling-2}. Dashed lines: theoretical predictions.

Theorems & Definitions (70)

  • Remark 1.1
  • Remark 1.2: Conjectured regime of validity
  • Theorem 1
  • Theorem 2
  • Lemma 4.1
  • Remark 4.1
  • Theorem 3
  • Remark 4.2
  • Theorem 4
  • Remark 5.1
  • ...and 60 more