Absorb and Converge: Provable Convergence Guarantee for Absorbing Discrete Diffusion Models
Yuchen Liang, Renxiang Huang, Lifeng Lai, Ness Shroff, Yingbin Liang
TL;DR
This paper tackles the theoretical gap in absorbing discrete diffusion models by delivering the first finite-time convergence guarantees for forward-backward Markovian diffusion under absorbing rate matrices. It introduces a surrogate initialization $p_{init}$ and uses a Jensen-based decomposition to bound the forward KL divergence, enabling stable analysis despite the absorbing singleton stationary distribution. The authors prove convergence bounds for both $\tau$-leaping and uniformization samplers, showing linear-in-$d$ dependence and improved rates over uniform-rate variants, and further provide no-early-stopping results under practical assumptions. Novel techniques include precise absorbing-score bounds and a non-diverging score near initialization, which collectively remove the need for early stopping in certain regimes. These results offer principled guidance for deploying absorbing discrete diffusion models in practice and lay groundwork for future extensions to conditional generation and broader absorbing-rate structures.
Abstract
Discrete state space diffusion models have shown significant advantages in applications involving discrete data, such as text and image generation. Their performance has also been observed to be highly sensitive to the choice of rate matrix, particularly between uniform and absorbing rate matrices. While empirical results suggest that absorbing rate matrices often yield better generation quality than uniform rate matrices, existing theoretical work has largely focused on the uniform case. Notably, convergence guarantees and error analyses for absorbing diffusion models are still missing. In this work, we provide the first finite-time error bounds and convergence rate analysis for discrete diffusion models using absorbing rate matrices. We begin by deriving an upper bound on the KL divergence of the forward process, introducing a surrogate initialization distribution to address the challenge posed by the absorbing stationary distribution, which is a singleton and therefore makes the KL divergence ill-defined. We then establish the first convergence guarantees for both the $\tau$-leaping and uniformization samplers under absorbing rate matrices, demonstrating improved rates over their counterparts using uniform rate matrices. Furthermore, under suitable assumptions, we provide convergence guarantees without early stopping. Our analysis introduces several new technical tools to address challenges unique to absorbing rate matrices. These include a Jensen-type argument for bounding forward process convergence, novel techniques for bounding absorbing score functions, and a non-divergent upper bound on the score near initialization that removes the need for early stopping.
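To illustrate why a singleton stationary distribution makes the KL divergence ill-defined, and how a surrogate initialization can repair it, the following is a minimal Python sketch for a single token. It is an illustration under stated assumptions, not the paper's construction: the unit absorbing rate, the specific surrogate form (mass `eps` spread uniformly over the non-mask states), and the helper names `forward_marginal`, `surrogate_init`, and `kl` are all hypothetical choices made here.

```python
import math

def forward_marginal(q0, t):
    """Marginal at time t of one token under an absorbing forward process.

    Assumption: unit absorbing rate, so a token is still unmasked with
    probability exp(-t). q0 lists probabilities over the non-mask states;
    the returned list has one extra final entry for the mask state.
    """
    keep = math.exp(-t)
    return [keep * p for p in q0] + [1.0 - keep]

def kl(p, q):
    """KL(p || q) for finite distributions; infinite if p puts mass where q has none."""
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0.0:
            continue
        if qi == 0.0:
            return math.inf
        total += pi * math.log(pi / qi)
    return total

def surrogate_init(num_states, eps):
    """Hypothetical surrogate: mass eps spread uniformly over the non-mask
    states, with the remaining 1 - eps on the mask state."""
    return [eps / num_states] * num_states + [1.0 - eps]

q0 = [0.7, 0.2, 0.1]                 # toy data distribution over 3 non-mask states
T = 5.0                              # forward horizon
q_T = forward_marginal(q0, T)

delta_mask = [0.0, 0.0, 0.0, 1.0]    # singleton stationary distribution
p_init = surrogate_init(len(q0), eps=math.exp(-T))

kl_singleton = kl(q_T, delta_mask)   # infinite: q_T still has unmasked mass
kl_surrogate = kl(q_T, p_init)       # finite, and small for large T

print("KL(q_T || delta_mask) =", kl_singleton)
print("KL(q_T || p_init)     =", kl_surrogate)
```

In this toy setting, choosing `eps = exp(-T)` makes the mask-state terms cancel, so the KL divergence against the surrogate reduces to $e^{-T}$ times the KL divergence between the data distribution and the uniform distribution over non-mask states, which decays exponentially in $T$. The singleton target, by contrast, gives an infinite divergence at every finite $T$.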
