Non-asymptotic Analysis of Diffusion Annealed Langevin Monte Carlo for Generative Modelling

Paula Cordero-Encinar, O. Deniz Akyildiz, Andrew B. Duncan

TL;DR

This work develops a non-asymptotic theoretical framework for diffusion annealed Langevin Monte Carlo (DALMC) applied to score-based generative models. It analyses general diffusion paths that interpolate between a simple base distribution $\nu$ (Gaussian or Student's $t$) and the data distribution $\pi_{\text{data}}$, and provides explicit KL bounds on the resulting sampling error under weak smoothness and moment assumptions. For the Gaussian diffusion path, the analysis yields a concrete iteration complexity of $M=\mathcal{O}\left( d (M_2\vee d)^2 L_{\max}^2 / \varepsilon^6 \right)$ steps to achieve $\varepsilon^2$ accuracy in KL divergence; extending the analysis to heavy-tailed (Student's $t$) paths shows comparable performance with tail-dependent constants. A key tool is the action of the diffusion path, which, combined with Girsanov-based KL bounds, quantifies discretisation bias and transport costs along the path. Overall, the results broaden the theoretical understanding of DALMC and prove new guarantees for both Gaussian and heavy-tailed diffusion models in finite time, with implications for practical score-based sampling and robustness to heavy-tailed data.
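To get a feel for how the stated bound scales, the following minimal sketch evaluates the expression inside the $\mathcal{O}(\cdot)$ with the hidden constant dropped. The function name `dalmc_steps_scaling` and the argument names are hypothetical, and the reading of $M_2$ as a second-moment quantity and $L_{\max}$ as a smoothness constant is an assumption, since this summary does not define them. The returned value is only meaningful as a ratio between two parameter settings, not as an actual step count.

```python
def dalmc_steps_scaling(d, m2, l_max, eps):
    """Evaluate d * max(m2, d)^2 * l_max^2 / eps^6.

    This is the expression inside the O(.) of the stated bound, with the
    unknown constant factor omitted (assumption: constant set to 1).
    """
    return d * max(m2, d) ** 2 * l_max ** 2 / eps ** 6


# Compare two accuracy levels at fixed dimension and constants.
coarse = dalmc_steps_scaling(d=10, m2=5.0, l_max=2.0, eps=0.5)
fine = dalmc_steps_scaling(d=10, m2=5.0, l_max=2.0, eps=0.25)
ratio = fine / coarse
print(ratio)
```

Because the bound depends on $\varepsilon^{-6}$, halving $\varepsilon$ multiplies the scaling by $2^6 = 64$, which is the ratio printed here.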

Abstract

We investigate the theoretical properties of general diffusion (interpolation) paths and their Langevin Monte Carlo implementation, referred to as diffusion annealed Langevin Monte Carlo (DALMC), under weak conditions on the data distribution. Specifically, we analyse and provide non-asymptotic error bounds for the annealed Langevin dynamics where the path of distributions is defined as Gaussian convolutions of the data distribution as in diffusion models. We then extend our results to recently proposed heavy-tailed (Student's t) diffusion paths, demonstrating their theoretical properties for heavy-tailed data distributions for the first time. Our analysis provides theoretical guarantees for a class of score-based generative models that interpolate between a simple distribution (Gaussian or Student's t) and the data distribution in finite time. This approach offers a broader perspective compared to standard score-based diffusion approaches, which are typically based on a forward Ornstein-Uhlenbeck (OU) noising process.
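As a rough illustration of annealed Langevin dynamics along a Gaussian diffusion path, the sketch below uses a one-dimensional toy problem in which every intermediate score is known in closed form. Everything specific here is an assumption made for illustration and is not taken from the paper: the data distribution is $\mathcal{N}(m, s^2)$, the base is $\mathcal{N}(0, \sigma^2)$, the path is the law of $\sqrt{\lambda}\,X + \sqrt{1-\lambda}\,\sigma Z$ with a linear schedule $\lambda_k = k/M$, and each step evaluates the score at the next point on the path. The names `gaussian_path_score` and `dalmc_toy` are hypothetical.

```python
import numpy as np


def gaussian_path_score(x, lam, m, s, sigma):
    """Score of N(sqrt(lam) * m, lam * s^2 + (1 - lam) * sigma^2) at x.

    This is the law of sqrt(lam) * X + sqrt(1 - lam) * sigma * Z for
    X ~ N(m, s^2) and Z ~ N(0, 1) independent (assumed toy path).
    """
    mean = np.sqrt(lam) * m
    var = lam * s**2 + (1.0 - lam) * sigma**2
    return -(x - mean) / var


def dalmc_toy(n_particles=5000, n_steps=2000, h=0.01,
              m=2.0, s=0.5, sigma=1.0, seed=0):
    """Run annealed Langevin steps from the base towards the toy data law."""
    rng = np.random.default_rng(seed)
    # Initialise particles from the base distribution N(0, sigma^2).
    x = sigma * rng.standard_normal(n_particles)
    for k in range(1, n_steps + 1):
        lam = k / n_steps  # linear schedule (assumption)
        noise = rng.standard_normal(n_particles)
        # Euler-Maruyama step of Langevin dynamics targeting mu_{lam}.
        x = x + h * gaussian_path_score(x, lam, m, s, sigma) \
            + np.sqrt(2.0 * h) * noise
    return x


samples = dalmc_toy()
print(samples.mean(), samples.std())
```

With these settings the empirical mean and standard deviation should land close to the toy target values $m = 2$ and $s = 0.5$, up to a small lag from the moving target and a small discretisation bias from the finite step size `h`. In a real score-based model the closed-form score would be replaced by a learned approximation.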

Paper Structure

This paper contains 41 sections, 22 theorems, 230 equations, 1 figure.

Key Result

Proposition 3.1

If $\pi_{\text{data}}$ has a finite log-Sobolev constant $C_{\text{LSI}}(\pi_{\text{data}})$, respectively Poincaré constant $C_{\text{PI}}(\pi_{\text{data}})$, the Gaussian diffusion path $(\mu_t)_{t \in [0,T]}$ defined in e respectively, where $C_{\text{LSI}}(\nu) = C_{\text{PI}}(\nu) = \sigma^{2}$.

Figures (1)

  • Figure 1: A visual comparison of the geometric path versus the diffusion path for $(\mu_t)_{t\in[0,1]}$. The base distribution is given by $\mu_0 := \mathcal{N}(0,1)$ and the data distribution, $\mu_1 := \pi_{\text{data}}$, is a mixture of a Gaussian and a smoothed uniform distribution (see Section \ref{sec:gaussian_diffusion}). As observed by Chehab et al. (2024), the geometric path in (a) creates intermediate multimodal distributions which are hard to sample from. In contrast, the diffusion path in (b) stays unimodal throughout, offering more favourable properties.

Theorems & Definitions (43)

  • Proposition 3.1
  • Lemma 3.2
  • Lemma 3.3
  • Theorem 3.4
  • Corollary 3.5
  • Theorem 3.6
  • Lemma 4.1
  • Lemma 4.2
  • Theorem 4.3
  • Lemma A.1: Lemma 2 from Guo et al. (2024)
  • ...and 33 more