How Discrete and Continuous Diffusion Meet: Comprehensive Analysis of Discrete Diffusion Models via a Stochastic Integral Framework

Yinuo Ren, Haoxuan Chen, Grant M. Rotskoff, Lexing Ying

TL;DR

The paper addresses the limited theoretical analysis of discrete diffusion models by introducing a Lévy-type stochastic-integral framework built on Poisson random measures with evolving intensity. This approach yields a unified path-wise formulation for both forward and backward discrete diffusions and enables a change-of-measure analysis that links inference losses to KL divergence. It derives the first KL-based error bound for the τ-leaping scheme and provides a comprehensive comparison with the uniformization method, including explicit step bounds and complexity, under a set of practical assumptions. The framework relaxes several prior stringent conditions and establishes a principled basis for designing efficient, accurate discrete diffusion algorithms with rigorous error control across real-world applications. Overall, the work bridges discrete and continuous diffusion theories, offering tools to analyze and improve discrete diffusion models in diverse domains.

Abstract

Discrete diffusion models have gained increasing attention for their ability to model complex distributions with tractable sampling and inference. However, the error analysis for discrete diffusion models remains less well-understood. In this work, we propose a comprehensive framework for the error analysis of discrete diffusion models based on Lévy-type stochastic integrals. By generalizing the Poisson random measure to that with a time-independent and state-dependent intensity, we rigorously establish a stochastic integral formulation of discrete diffusion models and provide the corresponding change of measure theorems that are intriguingly analogous to Itô integrals and Girsanov's theorem for their continuous counterparts. Our framework unifies and strengthens the current theoretical results on discrete diffusion models and obtains the first error bound for the $\tau$-leaping scheme in KL divergence. With error sources clearly identified, our analysis gives new insight into the mathematical properties of discrete diffusion models and offers guidance for the design of efficient and accurate algorithms for real-world discrete diffusion model applications.
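
For readers unfamiliar with $\tau$-leaping, the following is a minimal sketch of the general idea on the integer lattice, not the paper's algorithm: within each step the jump rates are frozen at the state held at the start of the step, and each jump direction fires a Poisson number of times. The names `sample_poisson`, `tau_leaping`, and `rates`, as well as the constant rate used in the usage note, are hypothetical and chosen for illustration only.

```python
import random


def sample_poisson(mean, rng):
    """Draw from Poisson(mean) by counting unit-rate exponential arrivals in [0, mean]."""
    count = 0
    elapsed = rng.expovariate(1.0)
    while elapsed <= mean:
        count += 1
        elapsed += rng.expovariate(1.0)
    return count


def tau_leaping(rates, x0, times, rng):
    """Approximate a jump process on the integers with tau-leaping.

    rates(t, x) is a hypothetical callable returning {jump direction: rate}.
    times is an increasing list of step boundaries s_0 < s_1 < ... < s_N.
    """
    x = x0
    for s, s_next in zip(times, times[1:]):
        frozen = rates(s, x)  # rates are frozen at the start of the step
        step = s_next - s
        # Every direction fires independently a Poisson(rate * step) number of times.
        x += sum(nu * sample_poisson(rate * step, rng) for nu, rate in frozen.items())
    return x
```

As a usage example, `tau_leaping(lambda t, x: {+1: 2.0}, 0, [0.0, 0.5, 1.0], random.Random(0))` returns a draw whose distribution is Poisson with mean 2, because a constant rate makes the frozen-rate approximation exact. With state-dependent rates the frozen rates introduce a discretization error, which is the kind of error the abstract refers to.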

Paper Structure

This paper contains 37 sections, 29 theorems, 147 equations, 1 figure, 2 algorithms.

Key Result

Theorem 2.1

Suppose the time discretization scheme $(s_i)_{i\in[0:N]}$ with $s_0 = 0$ and $s_N = T - \delta$ satisfies $s_{k+1} - s_k \leq \kappa (T - s_{k+1})$ for $k\in[0:N-1]$. Assume $\mathrm{cov}(p_0) = {\bm{I}}$, and the score function $\nabla \log p_t({\bm{x}}_t)$ is estimated by the neural network … Then under the following choice of the order of parameters we have a bound on the KL divergence $D_{\mathrm{KL}}(p_\delta \| \cdot)$ …
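
One way to meet the step-size condition in the theorem is a geometric schedule in the remaining time $T - s_k$. Writing $r_k = T - s_k$, the condition $s_{k+1} - s_k \leq \kappa (T - s_{k+1})$ is equivalent to $r_{k+1} \geq r_k / (1 + \kappa)$, so shrinking the remaining time by the factor $1 + \kappa$ per step, and clipping at $\delta$, satisfies it with equality on all but possibly the last step. The sketch below is an illustration under that observation; the function names and the parameter values in the usage note are assumptions, not taken from the paper.

```python
import math


def geometric_schedule(T, delta, kappa):
    """Return times s_0 = 0 < ... < s_N = T - delta with
    s_{k+1} - s_k <= kappa * (T - s_{k+1}) for every k."""
    if not 0.0 < delta < T or kappa <= 0.0:
        raise ValueError("need 0 < delta < T and kappa > 0")
    times = [0.0]
    remaining = T  # r_k = T - s_k
    while remaining > delta:
        # Largest admissible step: r_{k+1} = r_k / (1 + kappa), clipped at delta.
        remaining = max(remaining / (1.0 + kappa), delta)
        times.append(T - remaining)
    return times


def satisfies_condition(times, T, kappa, tol=1e-12):
    """Check the step-size condition up to a floating-point tolerance."""
    return all(
        times[k + 1] - times[k] <= kappa * (T - times[k + 1]) + tol
        for k in range(len(times) - 1)
    )
```

For instance, `geometric_schedule(10.0, 1e-3, 0.5)` ends at `T - delta` and uses $N = \lceil \log(T/\delta) / \log(1 + \kappa) \rceil$ steps, so the number of steps grows only logarithmically in $1/\delta$.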

Figures (1)

  • Figure 1: Example trajectories of the forward stochastic integrals w.r.t. Poisson random measures with different evolving intensities. The intensity is chosen as $\lambda_t(y) = 50 f_t$ if $|y - x_{t^-}| = 1$ and $0$ otherwise, as shown in dashed lines. Intuitively, $\lambda_t$ controls the rate of jumps at time $t$ and location $y$.
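
A trajectory of the kind described in the caption can be simulated with the standard thinning technique for time-inhomogeneous jump processes: propose candidate jump times at a constant rate that dominates the true total rate, then accept each candidate with probability equal to the ratio of the two. The sketch below assumes a one-dimensional integer state, so the only admissible targets are $x_{t^-} \pm 1$, each with rate $50 f_t$. The profile `f`, its upper bound `f_max`, and the function name `simulate_path` are hypothetical, since the caption does not specify $f_t$.

```python
import math
import random


def simulate_path(f, f_max, T, x0=0, scale=50.0, rng=None):
    """Simulate a nearest-neighbour jump process on the integers up to time T.

    Each of the two neighbours of the current state is reached at rate
    scale * f(t). Requires 0 <= f(t) <= f_max on [0, T]; f is a user-supplied
    profile (an assumption, not specified in the source).
    """
    if f_max <= 0.0 or scale <= 0.0:
        raise ValueError("f_max and scale must be positive")
    rng = rng or random.Random()
    rate_bound = 2.0 * scale * f_max  # dominates the total jump rate 2 * scale * f(t)
    t, x = 0.0, x0
    jump_times, states = [0.0], [x0]
    while True:
        t += rng.expovariate(rate_bound)  # next candidate jump time
        if t >= T:
            break
        ft = f(t)
        if not 0.0 <= ft <= f_max:
            raise ValueError("f(t) must lie in [0, f_max]")
        # Thinning: accept with probability (true total rate) / (dominating rate).
        if rng.random() < ft / f_max:
            x += rng.choice((-1, 1))  # both neighbours are equally likely
            jump_times.append(t)
            states.append(x)
    return jump_times, states
```

For example, `simulate_path(lambda t: 0.5 * (1.0 + math.sin(2.0 * math.pi * t)), 1.0, T=1.0, rng=random.Random(0))` produces a path that jumps frequently while the sinusoidal profile is near its peak and rarely while it is near zero.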

Theorems & Definitions (70)

  • Theorem 2.1: Error Analysis of Continuous Diffusion Models
  • Definition 3.1: Poisson Random Measure with Evolving Intensity
  • Proposition 3.2: Stochastic Integral Formulation of Discrete Diffusion Models
  • Theorem 3.3: Change of Measure for Poisson Random Measure with Evolving Density
  • Corollary 3.4: Equivalence between KL Divergence and Score Entropy-based Loss Function
  • Proposition 4.1: Stochastic Integral Formulation of $\tau$-Leaping
  • Proposition 4.2: Stochastic Integral Formulation of Uniformization
  • Theorem 4.7: Error Analysis of $\tau$-Leaping
  • Remark 4.8: Remark on Early Stopping
  • Theorem 4.9: Error Analysis of Uniformization
  • ...and 60 more