Efficient Approximate Posterior Sampling with Annealed Langevin Monte Carlo

Advait Parulekar, Litu Rout, Karthikeyan Shanmugam, Sanjay Shakkottai

TL;DR

This paper tackles posterior sampling for score-based priors under a measurement model by introducing Annealed Langevin Monte Carlo (ALMC). Instead of targeting exact posterior sampling in KL divergence, it constructs a time-varying path of posteriors μ_t, each defined with respect to a noised prior, and tracks this path with Langevin dynamics: a warm-start phase followed by annealing. The authors prove polynomial-time guarantees: with early stopping, the output is simultaneously close in KL divergence to the annealed posterior and close in Fisher divergence (FI) to the true posterior, under minimal assumptions (sub-Gaussian priors, Lipschitz scores, convex smooth likelihood). They further show that combining the KL and FI guarantees avoids mode collapse in multimodal settings, providing both global and local correctness for approximate posterior sampling. The results offer a principled, tractable framework for posterior inference with score-based models and suggest avenues for extension to other posterior-sampling paradigms.
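The two phases described above (warm start at a high noise level, then annealing along the path μ_t) can be illustrated on a toy problem. The following is a minimal sketch and not the paper's algorithm: it assumes a one-dimensional standard Gaussian prior whose noised score is known in closed form, a Gaussian measurement with R(x) = (x - y)^2 / (2 s^2), a linear noise schedule, and a fixed step size. All function names and parameter values are hypothetical choices made for illustration.

```python
import math
import random


def noised_prior_score(x, sigma):
    """Score of the toy prior N(0, 1) after adding N(0, sigma^2) noise,
    i.e. the score of N(0, 1 + sigma^2). Assumed known in closed form here."""
    return -x / (1.0 + sigma ** 2)


def grad_R(x, y, s):
    """Gradient of R(x) = (x - y)^2 / (2 s^2), a convex and smooth
    negative log-likelihood (toy choice)."""
    return (x - y) / s ** 2


def langevin_step(x, sigma, y, s, h, rng):
    """One unadjusted Langevin step targeting the density proportional to
    p_sigma(x) * exp(-R(x)), where p_sigma is the noised prior."""
    drift = noised_prior_score(x, sigma) - grad_R(x, y, s)
    return x + h * drift + math.sqrt(2.0 * h) * rng.gauss(0.0, 1.0)


def annealed_langevin(y, s=1.0, sigma_max=3.0, n_warm=100, n_anneal=400,
                      h=0.05, n_particles=2000, seed=0):
    """Run independent particles through a warm-start phase and an annealing phase."""
    rng = random.Random(seed)
    # Initialize every particle from the heavily noised prior N(0, 1 + sigma_max^2).
    init_std = math.sqrt(1.0 + sigma_max ** 2)
    xs = [rng.gauss(0.0, init_std) for _ in range(n_particles)]

    # Phase 1 (warm start): Langevin steps at the fixed, largest noise level.
    for _ in range(n_warm):
        xs = [langevin_step(x, sigma_max, y, s, h, rng) for x in xs]

    # Phase 2 (annealing): lower the noise level linearly to zero,
    # taking one Langevin step per noise level.
    for k in range(n_anneal):
        sigma = sigma_max * (1.0 - (k + 1) / n_anneal)
        xs = [langevin_step(x, sigma, y, s, h, rng) for x in xs]
    return xs


if __name__ == "__main__":
    samples = annealed_langevin(y=2.0)
    mean = sum(samples) / len(samples)
    var = sum((x - mean) ** 2 for x in samples) / len(samples)
    # For this conjugate toy model the exact posterior is N(y / (1 + s^2), s^2 / (1 + s^2)).
    print(f"sample mean {mean:.3f}, sample variance {var:.3f}")
```

In this conjugate setting the exact posterior is Gaussian, so the sample mean and variance can be compared against y / (1 + s^2) and s^2 / (1 + s^2); a small discretization bias from the fixed step size is expected. With a learned score network, `noised_prior_score` would be replaced by the network's output at the corresponding noise level.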

Abstract

We study the problem of posterior sampling in the context of score-based generative models. We have a trained score network for a prior $p(x)$, a measurement model $p(y|x)$, and are tasked with sampling from the posterior $p(x|y)$. Prior work has shown this to be intractable in KL (in the worst case) under well-accepted computational hardness assumptions. Despite this, popular algorithms for tasks such as image super-resolution, stylization, and reconstruction enjoy empirical success. Rather than establishing distributional assumptions or restricted settings under which exact posterior sampling is tractable, we view this as a more general "tilting" problem of biasing a distribution towards a measurement. Under minimal assumptions, we show that one can tractably sample from a distribution that is simultaneously close to the posterior of a noised prior in KL divergence and the true posterior in Fisher divergence. Intuitively, this combination ensures that the resulting sample is consistent with both the measurement and the prior. To the best of our knowledge, these are the first formal results for (approximate) posterior sampling in polynomial time.
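For reference, the quantities named in the abstract can be written out as follows. This is a sketch using standard definitions rather than the paper's exact notation: the symbols $\nu$ and $\pi$ for generic densities and the name $p_t$ for the noised prior are assumptions introduced here, and $R(x) := -\log p(y|x)$ is taken to be the negative log-likelihood of the measurement.

$$
p(x|y) \;\propto\; p(x)\,p(y|x) \;=\; p(x)\,e^{-R(x)}, \qquad \mu_t(x) \;\propto\; p_t(x)\,e^{-R(x)},
$$

$$
\mathsf{KL}\left(\nu\Vert\pi\right) = \int \nu(x)\,\log\frac{\nu(x)}{\pi(x)}\,dx, \qquad
\mathsf{FI}\left(\nu\Vert\pi\right) = \int \nu(x)\,\left\Vert\nabla\log\frac{\nu(x)}{\pi(x)}\right\Vert^2 dx .
$$

The first line expresses the "tilting" view: the posterior is the prior reweighted by $e^{-R}$, and $\mu_t$ applies the same tilt to a noised prior. In the second line, KL compares how mass is distributed globally, while the Fisher divergence compares scores pointwise and is therefore a local criterion that can be insensitive to the relative weights of well-separated modes.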

Paper Structure

This paper contains 16 sections, 28 theorems, 108 equations, 7 figures, 1 algorithm.

Key Result

Lemma 4.2

Take $T = \mathcal{O}(\frac{d}{\epsilon^2}\log\frac{\mathsf{KL}\left(\gamma\Vert \mu_\infty\right)}{\epsilon})$ and $T_{ws} = \mathcal{O}\left(\log \frac{d}{\epsilon}\right)$. The Warm Start phase of Algorithm 1 results in a sample $X_{T}$ satisfying $\mathsf{KL}\left(\mu_{T_{ws}}\Vert \text{

Figures (7)

  • Figure 1: (1) We sample using LMC from $\mu_T\approx\mu_\infty$. (2) We run the annealed LMC dynamics \ref{eq:AnnealedLMC} along the path $t\mapsto\mu_t$.
  • Figure 2: Three priors used in our experiments. A Mixture-of-Gaussians prior on the left, a "Vertical Bars" prior in the center (similar to Remark \ref{rem:synthetic}), and a "moons" prior on the right.
  • Figure 3: Likelihood functions used to define the posterior. $R(x) = \mathfrak{R}\Vert Ax\Vert^2$ where $A = 1000$. Essentially these are "noisy projections", somewhat analogous to an inpainting problem (one coordinate is seen, the other is not).
  • Figure 4: Resulting sampler, run with $\kappa = 400$. Shown are hex-jointplots of 10000 samples each.
  • Figure 5: Likelihood functions used to define the posterior, corresponding to a noisy Gaussian measurement $R(x) = \mathfrak{R}\Vert x\Vert^2$.
  • ...and 2 more figures

Theorems & Definitions (54)

  • Remark 2.1: Action
  • Remark 2.2: Annealing
  • Remark 4.2
  • Lemma 4.2
  • Proof Sketch
  • Theorem 4.3
  • Proof Sketch
  • Theorem 4.4
  • Proof Sketch
  • Corollary 4.1: KL + FI
  • ...and 44 more