Posterior Sampling in High Dimension via Diffusion Processes
Andrea Montanari, Yuchen Wu
TL;DR
This work introduces a diffusion-based approach to posterior sampling in high dimensions: it constructs a non-homogeneous diffusion whose drift is the posterior mean and discretizes it with Euler steps. The authors combine approximate message passing (AMP) with variational ideas based on the Thouless-Anderson-Palmer (TAP) free energy to approximate the posterior mean, enabling efficient sampling in canonical problems such as spiked matrix models and high-dimensional linear regression. They establish Wasserstein-distance guarantees under regularity and Lipschitz conditions and present practical algorithms with scalable complexity, supported by simulations that agree with AMP predictions. The approach connects stochastic localization, denoising diffusions, and variational inference to deliver provable, scalable posterior sampling for challenging statistical models.
Abstract
Sampling from the posterior is a key technical problem in Bayesian statistics. Rigorous guarantees are difficult to obtain for commonly used Markov chain Monte Carlo algorithms. In this paper, we study an alternative class of algorithms based on diffusion processes and variational methods. The diffusion is constructed in such a way that, at its final time, it approximates the target posterior distribution. The drift of this diffusion is given by the posterior expectation of the unknown parameter vector $\boldsymbol{\theta}$ given the data and the additional noisy observations. In order to construct an efficient sampling algorithm, we use a simple Euler discretization of the diffusion process, and leverage message passing algorithms and variational inference techniques to approximate the posterior expectation oracle. We apply this method to posterior sampling in two canonical problems in high-dimensional statistics: sparse regression and low-rank matrix estimation within the spiked model. In both cases we develop the first algorithms with accuracy guarantees in the regime of constant signal-to-noise ratios.
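To make the construction concrete, the following is a minimal sketch of the Euler-discretized diffusion on a toy target, not the algorithm analyzed in the paper. It assumes a target with independent coordinates in {-1, +1}, each with log-odds 2h, for which the posterior mean given the noisy observation y_t = t θ + B_t is exactly tanh(y_t + h). In the paper's setting this exact oracle would be replaced by an approximation built from message passing and variational inference. The names `posterior_mean`, `sample_via_diffusion`, `h`, `T`, and `n_steps` are illustrative choices, not taken from the paper.

```python
import numpy as np


def posterior_mean(y, h):
    """Exact posterior mean E[theta | y_t = y] for the toy target.

    Assumption: coordinates are independent, theta_i in {-1, +1},
    with P(theta_i = +1) = (1 + tanh(h)) / 2. For the observation
    y_t = t * theta + B_t the posterior log-odds are 2 * (y + h),
    so the mean is tanh(y + h) and does not depend on t explicitly.
    """
    return np.tanh(y + h)


def sample_via_diffusion(n, h, T=20.0, n_steps=2000, rng=None):
    """Euler discretization of dy_t = m(y_t) dt + dB_t, started at y_0 = 0.

    Returns a vector in {-1, +1}^n obtained by rounding the posterior
    mean at the final time T (for large T the mean is already close to
    a vertex of the hypercube).
    """
    rng = np.random.default_rng() if rng is None else rng
    dt = T / n_steps
    y = np.zeros(n)
    for _ in range(n_steps):
        drift = posterior_mean(y, h)           # posterior-mean oracle
        noise = np.sqrt(dt) * rng.standard_normal(n)
        y = y + drift * dt + noise             # one Euler step
    return np.sign(posterior_mean(y, h))


if __name__ == "__main__":
    p = 0.8                                    # P(theta_i = +1) in the toy target
    h = 0.5 * np.log(p / (1.0 - p))
    theta = sample_via_diffusion(n=20_000, h=h, rng=np.random.default_rng(0))
    print("empirical mean:", theta.mean(), "target mean:", 2 * p - 1)
```

Because the toy target factorizes, the empirical mean of the returned coordinates can be compared directly with the target mean 2p - 1. The step size and the final time T are arbitrary choices for this illustration; the sketch says nothing about the discretization or oracle accuracy required for the guarantees in the paper.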
