Posterior Sampling by Combining Diffusion Models with Annealed Langevin Dynamics

Zhiyang Xun, Shivam Gupta, Eric Price

TL;DR

The authors prove that combining diffusion models with an annealed variant of Langevin dynamics achieves conditional sampling in polynomial time using only an $L^4$ bound on the score error.

Abstract

Given a noisy linear measurement $y = Ax + \xi$ of a distribution $p(x)$, and a good approximation to the prior $p(x)$, when can we sample from the posterior $p(x \mid y)$? Posterior sampling provides an accurate and fair framework for tasks such as inpainting, deblurring, and MRI reconstruction, and several heuristics attempt to approximate it. Unfortunately, approximate posterior sampling is computationally intractable in general. To sidestep this hardness, we focus on (locally or globally) log-concave distributions $p(x)$. In this regime, Langevin dynamics yields posterior samples when the exact scores of $p(x)$ are available, but it is brittle to score-estimation error, requiring an MGF bound (sub-exponential error). By contrast, in the unconditional setting, diffusion models succeed with only an $L^2$ bound on the score error. We prove that combining diffusion models with an annealed variant of Langevin dynamics achieves conditional sampling in polynomial time using merely an $L^4$ bound on the score error.
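
As a reading aid for the setup above, suppose the noise is Gaussian, $\xi \sim \mathcal{N}(0, \eta^2 I_m)$. This is an assumption made here for illustration: the abstract only says "noisy", and $\eta$ is taken to be the noise scale that appears in the theorem statement on this page. Under that assumption, Bayes' rule splits the posterior score into the prior score plus an explicit measurement term. The following is a sketch of that identity, not the paper's algorithm:

```latex
\begin{align*}
p(x \mid y) &\propto p(x)\,\exp\!\left(-\frac{\|y - Ax\|^2}{2\eta^2}\right), \\
\nabla_x \log p(x \mid y) &= \nabla_x \log p(x) + \frac{A^\top (y - Ax)}{\eta^2}.
\end{align*}
```

The measurement term is known exactly, so any error in a Langevin-type sampler's drift comes from replacing $\nabla_x \log p(x)$ with a learned estimate. This is where the score-error bound enters.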

Paper Structure

This paper contains 42 sections, 74 theorems, 389 equations, 6 figures, 1 table, 3 algorithms.

Key Result

Theorem 1.1

Let $p(x)$ be an $\alpha$-strongly log-concave distribution over $\mathbb{R}^d$ with $L$-Lipschitz score. For any $0 < \varepsilon < 1$, there exist $K_1 = \mathrm{poly}(d, m, \frac{\|A\|}{\eta\sqrt{\alpha}}, \frac{1}{\varepsilon})$ and $K_2 = \mathrm{poly}(d, m, \ldots)$ ...

Figures (6)

  • Figure 1: A "locally nearly log-concave" distribution suitable for Theorem \ref{thm:gaussian_main}: uniform on the unit circle plus $\mathcal{N}(0, w^2 I_2)$. The Hessian's largest eigenvalue is much smaller near the bulk of the density than it is globally. Specifically, for $\|A\| w / \eta = O(1)$, a Gaussian measurement $\tilde{x}$ with $\sigma \le cw$ and $\varepsilon_{\text{score}} \le cw^{-1}$ for small enough $c > 0$ enables sampling from $p(x \mid y, \tilde{x})$.
  • Figure 2: Sampling process of Corollary \ref{cor:compressed_sensing}. Given the distribution $p(x)$ and measurement $y$, we (1) start with a warm start estimate $x_0$, which may not lie on the effective manifold containing $p(x)$; (2) use the diffusion process to sample from $p(x)$ in a ball around $x_0$, getting $x_1$ on the manifold but not matching $y$; and finally (3) use annealed Langevin dynamics to converge to $p(x \mid y)$. This works if $p(x)$ is locally close to log-concave, even if it is globally complicated. See Section \ref{sec:compressed_sensing} for a more detailed discussion.
  • Figure 3: Let $p=\mathcal{N}(0,1)$ and $y=x+\mathcal{N}(0,0.01)$. Starting from $X_0\sim p$, run the Langevin SDE $\mathrm{d}X_t = s_y(X_t)\,\mathrm{d}t + \sqrt{2}\,\mathrm{d}B_t$. Averaging over $y$, the marginal of $X_t$ remains Gaussian; its variance first contracts and then returns toward the prior. There is an intermediate time $t^*$ at which the variance of $X_{t^*}$ is lower by a constant factor; in high dimensions, this means $X_{t^*}$ is concentrated on an exponentially small region of $p$, so an $L^p$ bound on score error under $p$ does not effectively control the error under $X_{t^*}$. See Section \ref{sec:plain_langevin_app} for details.
  • Figure 4: For each of the three settings (inpainting, super-resolution, and Gaussian deblurring), we plot the $L^2$ distance between samples obtained by our annealed Langevin method and the ground truth samples in red. We plot the FID of the distribution obtained by running annealed Langevin in blue. We plot the baseline $L^2$ distance and FID for samples obtained by the DPS algorithm using red and blue dashed lines.
  • Figure 5: A set of samples for the inpainting task.
  • ...and 1 more figure
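
The one-dimensional example in the Figure 3 caption can be checked numerically. The sketch below assumes that $s_y$ is the exact posterior score of the Gaussian model, $s_y(x) = -x + (y - x)/0.01$, and that $X_0 \sim p$ is drawn independently of $y$; neither detail is spelled out in the caption. The function names `marginal_variance` and `simulate_variance` are illustrative and do not come from the paper. With these assumptions the SDE is an Ornstein-Uhlenbeck process, so the variance of $X_t$ averaged over $y$ has a closed form, which is compared against an Euler-Maruyama simulation.

```python
import numpy as np

PRIOR_VAR = 1.0   # p = N(0, 1), from the Figure 3 caption
NOISE_VAR = 0.01  # y = x + N(0, 0.01), from the Figure 3 caption


def marginal_variance(t):
    """Closed-form Var[X_t], averaged over y (assumes X_0 ~ p independent of y)."""
    rate = 1.0 / PRIOR_VAR + 1.0 / NOISE_VAR        # posterior precision = OU decay rate
    gain = (1.0 / NOISE_VAR) / rate                 # posterior mean is gain * y
    u = np.exp(-rate * np.asarray(t, dtype=float))  # decay factor e^{-rate * t}
    var_y = PRIOR_VAR + NOISE_VAR                   # marginal variance of y
    return (gain**2 * var_y * (1.0 - u) ** 2  # spread of the posterior mean across y
            + PRIOR_VAR * u**2                # remaining memory of X_0 ~ p
            + (1.0 - u**2) / rate)            # accumulated Brownian noise


def simulate_variance(t_end, n_paths=200_000, dt=1e-4, seed=0):
    """Euler-Maruyama estimate of Var[X_{t_end}] for dX = s_y(X) dt + sqrt(2) dB."""
    rng = np.random.default_rng(seed)
    x_true = rng.normal(0.0, np.sqrt(PRIOR_VAR), n_paths)
    y = x_true + rng.normal(0.0, np.sqrt(NOISE_VAR), n_paths)
    x = rng.normal(0.0, np.sqrt(PRIOR_VAR), n_paths)  # X_0 ~ p, independent of y
    for _ in range(int(round(t_end / dt))):
        score = -x / PRIOR_VAR + (y - x) / NOISE_VAR  # exact posterior score s_y(x)
        x = x + dt * score + np.sqrt(2.0 * dt) * rng.normal(size=n_paths)
    return float(x.var())


if __name__ == "__main__":
    t_star = np.log(2.0) / 101.0  # time at which the decay factor equals 1/2
    for t in (0.0, t_star, 0.1):
        print(f"t={t:.5f}  closed form={float(marginal_variance(t)):.4f}  "
              f"simulated={simulate_variance(t):.4f}")
```

Under these assumptions the closed form simplifies to $1 - \frac{200}{101}\,u(1-u)$ with $u = e^{-101 t}$. The variance therefore starts at 1, dips to $1 - 50/101 \approx 0.505$ at $t^* = \ln 2 / 101$, and returns to 1, matching the contract-then-recover behavior the caption describes.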

Theorems & Definitions (126)

  • Theorem 1.1: Posterior sampling with global log-concavity
  • Theorem 1.2: Posterior sampling with local log-concavity
  • Corollary 1.2: Competitive compressed sensing
  • Lemma A.1
  • Lemma A.2
  • proof
  • Corollary A.3
  • proof
  • Lemma A.4
  • proof
  • ...and 116 more