Bit-Level Discrete Diffusion with Markov Probabilistic Models: An Improved Framework with Sharp Convergence Bounds under Minimal Assumptions
Le-Tuyet-Nhi Pham, Dario Shariatian, Antonio Ocello, Giovanni Conforti, Alain Durmus
TL;DR
This work extends score-based generative modeling to discrete data by formulating a forward CTMC on the hypercube $\{0,1\}^d$ and a tractable time-reversed process. A discrete score, defined as a conditional expectation, is learned via an $\mathrm{L}^2$ projection with a denoiser-based reparameterization, enabling stable training with a simple regression target. The authors prove non-asymptotic convergence bounds for DMPMs under minimal assumptions, showing linear-in-dimension sampling error and providing early-stopping refinements to tighten the bound; they also demonstrate competitive performance on low- and high-dimensional discrete data, including binarized MNIST, with efficient sampling. The combination of a principled time-reversal derivation, a practical training objective, and rigorous convergence guarantees yields a scalable and theoretically grounded framework for discrete generative modeling with real-world impact on discrete structure synthesis.
Abstract
This paper introduces Discrete Markov Probabilistic Models (DMPMs), a novel discrete diffusion algorithm for discrete data generation. The algorithm operates in discrete bit space, where the noising process is a continuous-time Markov chain that flips labels uniformly at random. The time-reversal process, like the forward noise process, is a jump process with its intensity governed by a discrete analogue of the classical score function. Crucially, this intensity is proven to be the conditional expectation of a function of the forward process, underlining theoretical alignment with score-based generative models. We establish convergence bounds for the algorithm under minimal assumptions, ensuring robustness and efficiency, which we demonstrate through experiments on low-dimensional Bernoulli-distributed datasets and high-dimensional binary MNIST data. The results highlight competitive performance in generating discrete structures compared to the state-of-the-art. This work bridges theoretical foundations and practical applications, advancing the development of effective and theoretically grounded discrete generative modeling.
