Non-asymptotic Analysis of Diffusion Annealed Langevin Monte Carlo for Generative Modelling
Paula Cordero-Encinar, O. Deniz Akyildiz, Andrew B. Duncan
TL;DR
This work develops a non-asymptotic theoretical framework for diffusion annealed Langevin Monte Carlo (DALMC) applied to score-based generative models. It analyses general diffusion paths that interpolate between a simple base distribution $\nu$ (Gaussian or Student's $t$) and the data distribution $\pi_{data}$, and provides explicit KL bounds on the resulting sampling error under weak smoothness and moment assumptions. For the Gaussian diffusion path, the analysis yields a concrete iteration complexity of $M=\mathcal{O}\left( d (M_2\vee d)^2 L_{\max}^2 / \varepsilon^6 \right)$ steps to achieve $\varepsilon^2$ accuracy in KL; the extension to heavy-tailed (Student's $t$) paths gives comparable guarantees with tail-dependent constants. The key tools are the action of the diffusion path and Girsanov-based KL bounds, which together quantify the discretisation bias and the transport cost along the path. Overall, the results broaden the theoretical understanding of DALMC and provide new finite-time guarantees for both Gaussian and heavy-tailed diffusion models, with implications for practical score-based sampling and robustness to heavy-tailed data.
Abstract
We investigate the theoretical properties of general diffusion (interpolation) paths and their Langevin Monte Carlo implementation, referred to as diffusion annealed Langevin Monte Carlo (DALMC), under weak conditions on the data distribution. Specifically, we analyse and provide non-asymptotic error bounds for the annealed Langevin dynamics where the path of distributions is defined as Gaussian convolutions of the data distribution as in diffusion models. We then extend our results to recently proposed heavy-tailed (Student's t) diffusion paths, demonstrating their theoretical properties for heavy-tailed data distributions for the first time. Our analysis provides theoretical guarantees for a class of score-based generative models that interpolate between a simple distribution (Gaussian or Student's t) and the data distribution in finite time. This approach offers a broader perspective compared to standard score-based diffusion approaches, which are typically based on a forward Ornstein-Uhlenbeck (OU) noising process.
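To make the annealed Langevin scheme concrete, the following is a minimal one-dimensional sketch, not the authors' implementation. It assumes a toy Gaussian-mixture data distribution (so that the score along the path is available in closed form), a Gaussian diffusion path parameterised as the law of $\sqrt{\lambda} X + \sqrt{1-\lambda} Z$ with $Z \sim \mathcal{N}(0,1)$, and a linear schedule for $\lambda$ from 0 to 1. The function names `path_score` and `dalmc_sample`, the schedule, and all numerical values are illustrative assumptions.

```python
import numpy as np


def path_score(x, lam, means, weights, s2):
    """Score (gradient of log-density) of the intermediate distribution.

    Assumption: the path is the law of sqrt(lam) * X + sqrt(1 - lam) * Z,
    where X is a Gaussian mixture with component means `means`, common
    variance `s2`, mixing weights `weights`, and Z ~ N(0, 1).
    """
    m = np.sqrt(lam) * means                 # component means along the path
    var = lam * s2 + (1.0 - lam)             # common component variance
    diff = x[:, None] - m[None, :]           # shape (n, K)
    logw = np.log(weights)[None, :] - 0.5 * diff**2 / var
    logw -= logw.max(axis=1, keepdims=True)  # stabilise the softmax
    resp = np.exp(logw)
    resp /= resp.sum(axis=1, keepdims=True)  # posterior component weights
    return -(resp * diff).sum(axis=1) / var


def dalmc_sample(n, num_steps, step, means, weights, s2, seed=0):
    """Annealed Langevin Monte Carlo along the path (illustrative sketch).

    Starts from the base distribution N(0, 1) (lam = 0) and takes one
    Euler-Maruyama Langevin step per value of lam, with lam increasing
    linearly to 1 (a hypothetical schedule chosen for simplicity).
    """
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(n)
    for k in range(1, num_steps + 1):
        lam = k / num_steps
        noise = rng.standard_normal(n)
        x = x + step * path_score(x, lam, means, weights, s2) + np.sqrt(2.0 * step) * noise
    return x


# Toy target: equal-weight mixture of N(-2, 0.25) and N(2, 0.25).
means = np.array([-2.0, 2.0])
weights = np.array([0.5, 0.5])
samples = dalmc_sample(n=2000, num_steps=500, step=0.05,
                       means=means, weights=weights, s2=0.25)
```

In this sketch the number of Langevin steps `num_steps` plays the role of $M$ in the complexity bound, and `step` is the discretisation step size; in a generative-modelling setting the closed-form `path_score` would be replaced by a learned score approximation.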
