Almost Linear Convergence under Minimal Score Assumptions: Quantized Transition Diffusion
Xunpeng Huang, Yingyu Lin, Nikki Lijing Kuang, Hanze Dong, Difan Zou, Yian Ma, Tong Zhang
TL;DR
This work addresses inefficiencies in diffusion-based generative modeling by combining the discrete and continuous diffusion paradigms through Quantized Transition Diffusion (QTD). By histogram-quantizing the target distribution and embedding it into a binary Hamming-graph space, QTD enables long-range forward transitions with sparse connectivity and employs truncated uniformization for unbiased, efficient reverse-time sampling. The authors prove a TV-convergence guarantee with a score-evaluation complexity of $O(d \ln^2(d/\epsilon))$ under minimal score assumptions, advancing the theoretical foundations of diffusion modeling. Overall, QTD offers a principled, scalable framework that unifies discrete and continuous diffusion schemes and yields potential gains in sampling efficiency for high-dimensional data.
Abstract
Continuous diffusion models have demonstrated remarkable performance in data generation across various domains, yet their efficiency remains constrained by two critical limitations: (1) the local adjacency structure of the forward Markov process, which restricts long-range transitions in the data space, and (2) inherent biases introduced during the simulation of time-inhomogeneous reverse denoising processes. To address these challenges, we propose Quantized Transition Diffusion (QTD), a novel approach that integrates data quantization with discrete diffusion dynamics. Our method first transforms the continuous data distribution $p_*$ into a discrete one $q_*$ via histogram approximation and binary encoding, enabling efficient representation in a structured discrete latent space. We then design a continuous-time Markov chain (CTMC) with Hamming distance-based transitions as the forward process, which inherently supports long-range movements in the original data space. For reverse-time sampling, we introduce a truncated uniformization technique to simulate the reverse CTMC, which can provably provide unbiased generation from $q_*$ under minimal score assumptions. Through a novel KL dynamic analysis of the reverse CTMC, we prove that QTD can generate samples with $O(d\ln^2(d/\epsilon))$ score evaluations in expectation to approximate the $d$-dimensional target distribution $p_*$ within an $\epsilon$ error tolerance. Our method not only establishes state-of-the-art inference efficiency but also advances the theoretical foundations of diffusion-based generative modeling by unifying discrete and continuous diffusion paradigms.
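The quantization step described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes data in a bounded box $[\text{low}, \text{high})^d$, uniform histogram bins, and a plain most-significant-bit-first binary code, and the names `quantize_to_bits`, `dequantize_from_bits`, and `hamming` are hypothetical. The point is only to show why a single bit flip (Hamming distance 1) can correspond to a long-range move in the original data space.

```python
import numpy as np

def quantize_to_bits(x, low, high, bits_per_dim):
    """Map points in [low, high)^d to uniform histogram bins, then to binary codes.

    Assumption for illustration: uniform bins and MSB-first binary encoding.
    """
    x = np.asarray(x, dtype=float)
    n_bins = 2 ** bits_per_dim
    # Histogram step: index of the uniform bin containing each coordinate.
    idx = np.floor((x - low) / (high - low) * n_bins).astype(int)
    idx = np.clip(idx, 0, n_bins - 1)
    # Binary encoding: expand each bin index into bits, most significant first.
    shifts = np.arange(bits_per_dim - 1, -1, -1)
    bits = (idx[..., None] >> shifts) & 1
    # Concatenate the per-coordinate codes into one vector of d * bits_per_dim bits.
    return bits.reshape(*x.shape[:-1], -1).astype(np.uint8)

def dequantize_from_bits(bits, low, high, bits_per_dim):
    """Map binary codes back to the centers of their histogram bins."""
    lead_shape = np.shape(bits)[:-1]
    bits = np.asarray(bits).reshape(*lead_shape, -1, bits_per_dim)
    shifts = np.arange(bits_per_dim - 1, -1, -1)
    idx = (bits.astype(int) << shifts).sum(axis=-1)
    width = (high - low) / 2 ** bits_per_dim
    return low + (idx + 0.5) * width

def hamming(a, b):
    """Number of positions at which two binary codes differ."""
    return int(np.sum(np.asarray(a) != np.asarray(b)))

if __name__ == "__main__":
    x = np.array([[-0.9, 0.1]])
    code = quantize_to_bits(x, -1.0, 1.0, bits_per_dim=3)
    print("code:", code[0])
    print("bin centers:", dequantize_from_bits(code, -1.0, 1.0, 3)[0])

    # Flip the most significant bit of the first coordinate: one Hamming step.
    neighbor = code.copy()
    neighbor[0, 0] ^= 1
    print("Hamming distance:", hamming(code[0], neighbor[0]))
    print("neighbor centers:", dequantize_from_bits(neighbor, -1.0, 1.0, 3)[0])
```

In this toy setting, flipping the leading bit of a coordinate shifts it by half the width of the box, which is the sense in which transitions between Hamming neighbors are not local in the original space.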
