Non-Asymptotic Convergence of Discrete Diffusion Models: Masked and Random Walk Dynamics
Giovanni Conforti, Alain Durmus, Le-Tuyet-Nhi Pham, Gael Raoul
TL;DR
This work develops a unified, non-asymptotic theory for discrete diffusion models (DDMs) on both finite and countably infinite state spaces. By modeling the forward noising dynamics as continuous-time Markov chains and expressing the backward dynamics through discrete scores, the authors derive sharp Kullback-Leibler and total-variation bounds for three DDMs: the random walk on $\mathbb{Z}_m^d$, masked diffusion on $\mathbb{Z}_m^d$, and a biased random walk on $\mathbb{N}^d$. A key novelty is a score-monotonicity analysis along the backward dynamics, which enables error decompositions that avoid bounded-score assumptions and yields a computational complexity that is linear in the dimension, up to logarithmic factors. The paper also introduces early-stopping schemes that relax the regularity requirements, so that the convergence guarantees hold under weaker assumptions. Overall, the results provide a principled, scalable framework for discrete diffusion modeling with rigorous non-asymptotic guarantees.
Abstract
Diffusion models for continuous state spaces based on Gaussian noising processes are now relatively well understood, as many works have focused on their theoretical analysis. In contrast, results for diffusion models on discrete state spaces remain limited and pose significant challenges, particularly due to their combinatorial structure and their more recent introduction in generative modelling. In this work, we establish new and sharp convergence guarantees for three popular discrete diffusion models (DDMs). Two of these models are designed for finite state spaces and are based respectively on the random walk and the masking process. The third DDM we consider is defined on the countably infinite space $\mathbb{N}^d$ and uses a drifted random walk as its forward process. For each of these models, the backward process can be characterized by a discrete score function that can, in principle, be estimated. However, even with perfect access to these scores, simulating the exact backward process is infeasible, and one must rely on approximations. In this work, we study Euler-type approximations and establish convergence bounds in both Kullback-Leibler divergence and total variation distance for the resulting models, under minimal assumptions on the data distribution. In particular, we show that the computational complexity of each method scales linearly in the dimension, up to logarithmic factors. Furthermore, to the best of our knowledge, this study provides the first non-asymptotic convergence guarantees for these noising processes that do not rely on boundedness assumptions on the estimated score.
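To make the two finite-state forward processes concrete, here is a minimal Python sketch of sampling the noised state at time t from a masking process and from a random walk on $\mathbb{Z}_m^d$. It is illustrative only and is not taken from the paper: the function names, the `MASK` sentinel, the unit jump rate, and the choice of independent, nearest-neighbour, symmetric per-coordinate dynamics are all assumptions. It covers only the forward noising step, not the score-based backward process or its Euler-type approximation.

```python
import numpy as np

MASK = -1  # hypothetical sentinel for the mask state (assumption, not from the paper)


def forward_mask(x0, t, rng, rate=1.0):
    """Sample X_t for a masking chain (assumed dynamics): each coordinate
    jumps to MASK after an independent Exp(rate) waiting time and stays there."""
    x0 = np.asarray(x0)
    # A coordinate is masked by time t with probability 1 - exp(-rate * t).
    masked = rng.random(x0.shape) < 1.0 - np.exp(-rate * t)
    return np.where(masked, MASK, x0)


def forward_random_walk(x0, t, m, rng, rate=1.0):
    """Sample X_t for a symmetric nearest-neighbour random walk on (Z_m)^d
    (assumed dynamics): each coordinate jumps +1 or -1 mod m at total rate `rate`."""
    x0 = np.asarray(x0)
    # The number of jumps per coordinate up to time t is Poisson(rate * t).
    n_jumps = rng.poisson(rate * t, size=x0.shape)
    # Each jump is +1 with probability 1/2, so the count of +1 jumps is binomial.
    ups = rng.binomial(n_jumps, 0.5)
    # Net displacement is (#up) - (#down) = 2 * ups - n_jumps, reduced mod m.
    return (x0 + 2 * ups - n_jumps) % m


rng = np.random.default_rng(0)
x0 = np.array([0, 1, 2, 3])
x_masked = forward_mask(x0, t=0.5, rng=rng)
x_walked = forward_random_walk(x0, t=0.5, m=4, rng=rng)
```

Because both chains are assumed to act independently on each coordinate, the marginal at time t can be sampled directly, without simulating individual jump times. The backward process offers no such shortcut, which is why the abstract turns to Euler-type approximations.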
