Diffusion Models under Alternative Noise: Simplified Analysis and Sensitivity
Juhyeok Choi, Chenglin Fan
TL;DR
The paper frames diffusion models as discretized variance-preserving SDEs and gives a simplified Grönwall-based analysis that establishes a strong convergence rate of $O(T^{-1/2})$ for the Euler–Maruyama discretization with $T$ steps under standard Lipschitz conditions. It shows that the Gaussian noise in reverse-time sampling can be replaced by symmetric discrete noise with matched first two moments without affecting the convergence rate, and validates this claim empirically on MNIST and CIFAR-10. The analysis extends to $d$ dimensions with an error bound scaling as $O(d^{1/2} h^{1/2})$ in the step size $h$, implying a sampling complexity of $T = O(d/\varepsilon^2)$ for target accuracy $\varepsilon$. Because the noise can be drawn from low-cost RNGs while preserving sample quality, the results are practically relevant for hardware-constrained deployment; the experiments show near-parity with Gaussian noise on both benchmarks.
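The step from the error bound to the sampling complexity can be spelled out as follows. This is only a sketch: it assumes a fixed time horizon $\tau$ split into $T$ steps of size $h = \tau/T$ and a dimension-free constant $C$ in the bound, and both $\tau$ and $C$ are symbols introduced here for illustration rather than taken from the paper.

$$
C\, d^{1/2} h^{1/2} \le \varepsilon
\quad\Longleftrightarrow\quad
h \le \frac{\varepsilon^{2}}{C^{2} d}
\quad\Longleftrightarrow\quad
T = \frac{\tau}{h} \ge \frac{C^{2}\,\tau\, d}{\varepsilon^{2}},
$$

so under these assumptions $T = O(d/\varepsilon^{2})$ steps suffice to reach error $\varepsilon$.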
Abstract
Diffusion models, typically formulated as discretizations of stochastic differential equations (SDEs), have achieved state-of-the-art performance in generative tasks. However, their theoretical analysis often involves complex proofs. In this work, we present a simplified framework for analyzing the Euler–Maruyama discretization of variance-preserving SDEs (VP-SDEs). Using Grönwall's inequality, we derive a convergence rate of $O(T^{-1/2})$ under standard Lipschitz assumptions, streamlining prior analyses. We then demonstrate that the standard Gaussian noise can be replaced by computationally cheaper discrete random variables (e.g., Rademacher) without sacrificing this convergence guarantee, provided the mean and variance are matched. Our experiments validate this theory, showing that (i) discrete noise with correctly matched variance achieves sample quality comparable to Gaussian noise, and (ii) performance degrades when the noise variance is scaled incorrectly.
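The noise-substitution idea can be illustrated with a small toy example. The sketch below is not the paper's experimental setup: it assumes a one-dimensional VP-SDE with a constant rate `beta` and Gaussian data, so that the score is available in closed form and no network is needed. The function names, parameters, and default values are all hypothetical. It runs the reverse-time Euler–Maruyama update with either Gaussian or Rademacher noise, with an optional `noise_scale` to mimic a mismatched variance.

```python
import math
import random
import statistics


def draw_noise(rng, kind):
    """Return one zero-mean, unit-variance noise draw."""
    if kind == "gaussian":
        return rng.gauss(0.0, 1.0)
    if kind == "rademacher":
        # One random bit mapped to {-1, +1}: mean 0, variance 1.
        return 1.0 if rng.getrandbits(1) else -1.0
    raise ValueError(f"unknown noise kind: {kind}")


def sample_reverse_vp_sde(kind, n_samples=4000, n_steps=200, beta=1.0,
                          horizon=8.0, data_mean=1.0, data_std=0.5,
                          noise_scale=1.0, seed=0):
    """Euler-Maruyama sampling of a toy 1-D reverse-time VP-SDE.

    Assumption (illustrative only): the data are N(data_mean, data_std^2),
    so the forward marginal at time t is Gaussian and its score is exact.
    """
    rng = random.Random(seed)
    h = horizon / n_steps  # step size
    # Start from the N(0, 1) prior, as is standard for VP-SDEs.
    xs = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
    for k in range(n_steps, 0, -1):
        t = k * h
        decay = math.exp(-beta * t)
        mean_t = data_mean * math.sqrt(decay)        # forward marginal mean
        var_t = data_std ** 2 * decay + 1.0 - decay  # forward marginal variance
        sigma = noise_scale * math.sqrt(beta * h)    # per-step noise std
        # Reverse step: x <- x + h * (beta/2 * x + beta * score) + sigma * xi,
        # where score = -(x - mean_t) / var_t.
        xs = [
            x + h * (0.5 * beta * x - beta * (x - mean_t) / var_t)
            + sigma * draw_noise(rng, kind)
            for x in xs
        ]
    return xs


if __name__ == "__main__":
    settings = [("gaussian", 1.0), ("rademacher", 1.0), ("rademacher", 2.0)]
    for kind, scale in settings:
        xs = sample_reverse_vp_sde(kind, noise_scale=scale)
        print(kind, scale,
              round(statistics.fmean(xs), 3), round(statistics.pstdev(xs), 3))
```

In this toy setting the first two runs should recover a sample mean and standard deviation close to the data values, while the run with `noise_scale=2.0` should produce a visibly inflated spread, in line with the abstract's claims (i) and (ii). The example only checks low-order moments on a linear problem, so it says nothing about image-quality metrics.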
