Nearly $d$-Linear Convergence Bounds for Diffusion Models via Stochastic Localization
Joe Benton, Valentin De Bortoli, Arnaud Doucet, George Deligiannidis
TL;DR
This work closes a gap in the theoretical understanding of diffusion-model convergence by proving bounds that scale linearly with the data dimension (up to logarithmic factors) under only finite second moments, removing the need for strong smoothness assumptions. The authors combine Girsanov-based KL analysis with a refined treatment of the discretization error inspired by stochastic localization, introducing a key lemma to control covariance terms and enable tight path-measure comparisons. Under an appropriate score-estimation error bound and early stopping, they show that a diffusion model requires at most $\tilde{O}\bigl(\frac{d \log^2(1/\delta)}{\varepsilon^2}\bigr)$ steps to approximate an arbitrary distribution, corrupted with Gaussian noise of variance $\delta$, to KL error $\varepsilon^2$, addressing the previously observed quadratic-in-$d$ gap. The results imply that diffusion-based sampling can scale more favorably with dimension in theory, matching intuition from stochastic localization and offering practical guidance for high-dimensional generative modeling.
Abstract
Denoising diffusions are a powerful method to generate approximate samples from high-dimensional data distributions. Recent results provide polynomial bounds on their convergence rate, assuming $L^2$-accurate scores. Until now, the tightest bounds were either superlinear in the data dimension or required strong smoothness assumptions. We provide the first convergence bounds which are linear in the data dimension (up to logarithmic factors) assuming only finite second moments of the data distribution. We show that diffusion models require at most $\tilde O(\frac{d \log^2(1/\delta)}{\varepsilon^2})$ steps to approximate an arbitrary distribution on $\mathbb{R}^d$ corrupted with Gaussian noise of variance $\delta$ to within $\varepsilon^2$ in KL divergence. Our proof extends the Girsanov-based methods of previous works. We introduce a refined treatment of the error from discretizing the reverse SDE inspired by stochastic localization.
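To make the scaling in the stated bound concrete, the following minimal sketch evaluates only its leading term, $d \log^2(1/\delta)/\varepsilon^2$. The function name `step_count_scaling` is hypothetical, the constant is set to 1, and the additional logarithmic factors hidden by $\tilde O$ are dropped. Both are assumptions made for illustration, so the output indicates how the bound scales rather than an actual number of steps.

```python
import math


def step_count_scaling(d: int, delta: float, eps: float) -> float:
    """Leading term d * log^2(1/delta) / eps^2 of the stated bound.

    Assumption: the constant is taken to be 1 and the extra logarithmic
    factors hidden by the tilde-O notation are ignored.
    """
    if d <= 0:
        raise ValueError("d must be a positive dimension")
    if not 0.0 < delta < 1.0:
        raise ValueError("delta must lie in (0, 1)")
    if eps <= 0.0:
        raise ValueError("eps must be positive")
    return d * math.log(1.0 / delta) ** 2 / eps ** 2


# Illustrative values only; they are not taken from the paper.
base = step_count_scaling(d=1000, delta=1e-3, eps=0.1)
doubled_dim = step_count_scaling(d=2000, delta=1e-3, eps=0.1)
halved_eps = step_count_scaling(d=1000, delta=1e-3, eps=0.05)

print("ratio when d doubles:  ", doubled_dim / base)
print("ratio when eps halves: ", halved_eps / base)
```

Under these assumptions, doubling $d$ doubles the leading term, while halving $\varepsilon$ multiplies it by four. A bound quadratic in $d$ would instead grow fourfold when $d$ doubles.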
