On the $ε$-Free Inference Complexity of Absorbing Discrete Diffusion
Xunpeng Huang, Yingyu Lin, Nishant Jain, Kaibo Wang, Difan Zou, Yian Ma, Tong Zhang
TL;DR
AATU is introduced and it is proved that AATU achieves TV convergence with complexity-independent of the error tolerance $\epsilon$-thereby strictly outperforming existing uniform baselines, and eliminates the restrictive bounded-score assumption commonly required in prior studies of uniformization-based inference.
Abstract
Absorbing discrete diffusion has emerged as a dominant framework for discrete data generation. However, a significant disparity remains between its empirical success and theoretical understanding: existing analyses fail to demonstrate a complexity advantage over the $\mathcal{O}(d \ln(d/ε))$ baseline established for \emph{uniform} discrete diffusion. We bridge this gap by identifying a critical structural advantage: whereas uniform diffusion redundantly re-denoises valid elements, the absorbing scheme denoises each absorbing state exactly once. Leveraging this insight, we introduce \emph{Absorbing-Aware Truncated Uniformization} (AATU). We prove that AATU achieves $ε$-TV convergence with $\mathcal{O}(d \ln d)$ complexity-\emph{independent} of the error tolerance $ε$-thereby strictly outperforming existing uniform baselines. Beyond improving convergence rates, our analysis eliminates the restrictive bounded-score assumption commonly required in prior studies of uniformization-based inference. Furthermore, we extend AATU to time-invariant parameterizations, showing that it naturally adopts an imputation-type inference with a uniformly randomized denoising order. When combined with a lazy update strategy, TV convergence requires only $\mathcal{O}(d)$ discrete score evaluations. These results not only establish a rigorous foundation for absorbing discrete diffusion -- confirming its efficiency in high-accuracy generation -- but also open new avenues for analyzing diffusion-based language models under the masking paradigm.
