Fast Sampling via Discrete Non-Markov Diffusion Models with Predetermined Transition Time

Zixiang Chen, Huizhuo Yuan, Yongqian Li, Yiwen Kou, Junkai Zhang, Quanquan Gu

TL;DR

Problem: slow sampling in discrete diffusion models limits their practical use for text generation and machine translation. Approach: a discrete non-Markov diffusion model (DNDM) with a predetermined transition-time set that enables training-free, deterministic reverse sampling, plus a continuous-time, infinite-step variant, DNDM-C. Contributions: (i) a theoretically grounded DNDM that preserves q(x_t) and q(x_0|x_t) while reducing neural network evaluations; (ii) an accelerated reverse sampler with NFE ~ |T|, i.e., the number of function evaluations scales with the size of the transition-time set; (iii) an analysis of infinite-step sampling that bridges discrete- and continuous-time diffusion; and (iv) extensive language generation experiments showing large speedups with competitive quality. Impact: enables fast discrete diffusion for NLP tasks and provides a unified framework for discrete-transition-time sampling; extension to audio and image generation is left to future work.

Abstract

Discrete diffusion models have emerged as powerful tools for high-quality data generation. Despite their success in discrete spaces, such as text generation tasks, the acceleration of discrete diffusion models remains under-explored. In this paper, we propose discrete non-Markov diffusion models (DNDM), which naturally induce the predetermined transition time set. This enables a training-free sampling algorithm that significantly reduces the number of function evaluations (i.e., calls to the neural network), making the sampling process much faster. Furthermore, we study the transition from finite to infinite step sampling, offering new insights into bridging the gap between discrete and continuous-time processes for discrete diffusion models. Extensive experiments on natural language generation and machine translation tasks demonstrate the superior performance of our method in terms of both generation speed and sample quality compared to existing methods for discrete diffusion models.
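
To make the reduction in function evaluations concrete, the following is a minimal sketch rather than the authors' implementation. It assumes, beyond what the abstract states, that each token changes exactly once, at a transition time $\tau$ with $P(\tau = t) = \alpha_{t-1} - \alpha_t$ for a schedule $\alpha_t$ that decreases from 1 to 0, and that the network only has to be called at the distinct transition times. The linear schedule, the function name `sample_transition_times`, and the sizes are hypothetical.

```python
import numpy as np

def sample_transition_times(alphas, num_tokens, rng):
    """Draw one transition time per token with P(tau = t) = alpha_{t-1} - alpha_t.

    `alphas` has length T + 1 with alphas[0] = 1 and alphas[T] = 0
    (an assumed schedule, not taken from the paper).
    """
    probs = alphas[:-1] - alphas[1:]        # probs[t - 1] = P(tau = t), t = 1..T
    steps = np.arange(1, len(alphas))       # candidate times 1..T
    return rng.choice(steps, size=num_tokens, p=probs)

T, num_tokens = 1000, 32
alphas = np.linspace(1.0, 0.0, T + 1)       # hypothetical linear schedule
rng = np.random.default_rng(0)

taus = sample_transition_times(alphas, num_tokens, rng)
transition_set = np.unique(taus)            # predetermined transition-time set
nfe = len(transition_set)                   # one network call per distinct time
print(f"T = {T}, tokens = {num_tokens}, network calls needed = {nfe}")
```

Under these assumptions the number of network calls is bounded by the smaller of the sequence length and the number of steps, which is why a sampler that calls the network only on the transition-time set can be much cheaper than one that calls it at every step.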

Paper Structure

This paper contains 27 sections, 3 theorems, 42 equations, 8 figures, 13 tables, 4 algorithms.

Key Result

Theorem 3.1

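For the non-Markov process in \eqref{eq:2}, we have $q(\mathbf{x}_t \mid \mathbf{x}_0) = \mathrm{Cat}\big(\mathbf{x}_t;\ \mathbf{p} = \alpha_t \mathbf{x}_0 + (1 - \alpha_t)\,\mathbf{q}_{\mathrm{noise}}\big)$, where $\alpha_t := \prod_{s=1}^{t}\beta_s$ is specified to decrease from $1$ to $0$.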
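
A marginal of this kind can be checked numerically. The sketch below is not from the paper. It assumes the non-Markov forward process has the form $x_t = b_t x_{t-1} + (1 - b_t) w$ with $b_t \sim \mathrm{Bernoulli}(\beta_t)$ and a single noise token $w$ drawn once per sample from a uniform noise distribution over $K$ classes. Under that assumption the probability that $x_t$ still equals $x_0$ should be $\alpha_t + (1 - \alpha_t)/K$. The constant $\beta$ schedule is hypothetical and, for brevity, does not drive $\alpha_t$ all the way to 0.

```python
import numpy as np

def simulate_non_markov(x0, betas, num_classes, rng):
    """Run x_t = b_t * x_{t-1} + (1 - b_t) * w with one noise draw w per sample."""
    n = x0.shape[0]
    w = rng.integers(0, num_classes, size=n)  # noise token, fixed across all steps
    x = x0.copy()
    trajectory = []
    for beta in betas:
        keep = rng.random(n) < beta           # b_t ~ Bernoulli(beta_t)
        x = np.where(keep, x, w)              # keep x_{t-1} or jump to w
        trajectory.append(x.copy())
    return np.stack(trajectory)               # shape (T, n)

rng = np.random.default_rng(0)
K, T, n = 5, 10, 200_000
betas = np.full(T, 0.8)                       # hypothetical constant schedule
alphas = np.cumprod(betas)                    # alpha_t = prod_{s <= t} beta_s

x0 = np.zeros(n, dtype=np.int64)              # every sample starts in class 0
trajectory = simulate_non_markov(x0, betas, K, rng)

empirical = (trajectory == 0).mean(axis=1)    # estimate of P(x_t = x_0)
predicted = alphas + (1 - alphas) / K         # alpha_t + (1 - alpha_t) / K
max_gap = float(np.abs(empirical - predicted).max())
print(f"largest gap between empirical and predicted marginal: {max_gap:.4f}")
```

Because the noise token is shared across steps, a sample that has already jumped to $w$ stays there, which is what makes the process non-Markov while leaving the per-step marginal unchanged in this simulation.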

Figures (8)

  • Figure 1: Generation quality to generation time comparison on $\mathtt{IWSLT14}$. $x$-axis: computational time in seconds; $y$-axis: BLEU score.
  • Figure 2: The 100-step generation process of DNDM-$k$-Multi as an example. Left: the BLEU score along the generation process; right: the text at different time steps. As time goes from 100 to 0, noise is gradually removed until the corresponding English text emerges. Since the transition time follows a Beta distribution, as described in Section \ref{sec:fast}, the majority of transitions occur near the starting time.
  • Figure 3: Different distributions of transition time for $T = 50$. (a), (b), (c) The transition time sampled 1K times under different $\alpha_{t}$ schedules. (d) The approximated transition time for $t = 1, \ldots, T$ using different hyperparameters.
  • Figure 4: Growth of computational time as the number of sampling steps increases.
  • Figure 5: Text in the Generation Process
  • ...and 3 more figures
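
The dependence of the transition-time distribution on the $\alpha_t$ schedule, as in the Figure 3 caption, can be sketched as follows. This is an illustration under stated assumptions, not a reproduction of the figure: it assumes $P(\tau = t) = \alpha_{t-1} - \alpha_t$, and the two schedules ("linear" and a cosine-shaped one) are hypothetical choices that may differ from those used in the paper.

```python
import numpy as np

def transition_pmf(alphas):
    """Return P(tau = t) = alpha_{t-1} - alpha_t for t = 1..T."""
    return alphas[:-1] - alphas[1:]

T = 50
t = np.arange(T + 1)
steps = np.arange(1, T + 1)
schedules = {                                  # both are assumed, for illustration
    "linear": 1.0 - t / T,
    "cosine": np.cos(0.5 * np.pi * t / T),
}

rng = np.random.default_rng(0)
means = {}
for name, alphas in schedules.items():
    pmf = transition_pmf(alphas)
    pmf = pmf / pmf.sum()                      # guard against rounding error
    draws = rng.choice(steps, size=1000, p=pmf)  # 1K draws, as in the caption
    means[name] = float(np.dot(steps, pmf))    # exact mean of tau
    print(f"{name:>6}: exact mean = {means[name]:.2f}, "
          f"sample mean = {draws.mean():.2f}")
```

With these two schedules, the linear one spreads transitions uniformly over the steps, while the cosine-shaped one pushes them toward larger $t$, that is, toward the noisy end where reverse sampling starts.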

Theorems & Definitions (12)

  • Theorem 3.1
  • Definition 3.2
  • Remark 3.3
  • Remark 3.4
  • Remark 3.5
  • Theorem 3.6
  • Remark 3.7
  • Remark B.1
  • Theorem D.1
  • Remark D.2
  • ...and 2 more