Discrete Modeling via Boundary Conditional Diffusion Processes

Yuxuan Gu, Xiaocheng Feng, Lei Huang, Yingsheng Wu, Zekun Zhou, Weihong Zhong, Kun Zhu, Bing Qin

TL;DR

A novel framework for efficiently and effectively extending the powerful continuous diffusion processes to discrete modeling. It achieves results comparable to continuous diffusion models when using discrete ordinal pixels and establishes a new state-of-the-art for categorical image generation on the Cifar-10 dataset.

Abstract

We present a novel framework for efficiently and effectively extending the powerful continuous diffusion processes to discrete modeling. Previous approaches have suffered from the discrepancy between discrete data and continuous modeling. Our study reveals that the absence of guidance from discrete boundaries in learning probability contours is one of the main reasons. To address this issue, we propose a two-step forward process that first estimates the boundary as a prior distribution and then rescales the forward trajectory to construct a boundary conditional diffusion model. The reverse process is proportionally adjusted to guarantee that the learned contours yield more precise discrete data. Experimental results indicate that our approach achieves strong performance in both language modeling and discrete image generation tasks. In language modeling, our approach surpasses previous state-of-the-art continuous diffusion language models in three translation tasks and a summarization task, while also demonstrating competitive performance compared to auto-regressive transformers. Moreover, our method achieves comparable results to continuous diffusion models when using discrete ordinal pixels and establishes a new state-of-the-art for categorical image generation on the Cifar-10 dataset.
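
To make the idea of a "discrete boundary" in the abstract concrete, the following is a minimal toy sketch, not the authors' algorithm. It assumes that a continuous point is decoded to its nearest codebook vector, so the discrete area of a data point is its nearest-neighbour cell, and it assumes a straight-line trajectory from data to noise (chosen only because a straight line leaves a convex cell at most once). The names `decode`, `forward_point`, and `boundary_time` are hypothetical and do not come from the paper.

```python
import numpy as np


def decode(x, codebook):
    """Assumed decoding rule: index of the nearest codebook vector."""
    return int(np.argmin(np.linalg.norm(codebook - x, axis=1)))


def forward_point(x0, eps, t):
    """Toy straight-line trajectory from data (t=0) to noise (t=1)."""
    return (1.0 - t) * x0 + t * eps


def boundary_time(x0, eps, codebook, iters=50):
    """Bisect for the time at which the trajectory leaves the cell of x0.

    Returns None when the noise endpoint still decodes to the same index,
    i.e. the trajectory never crosses the boundary.
    """
    k = decode(x0, codebook)
    if decode(eps, codebook) == k:
        return None
    lo, hi = 0.0, 1.0  # lo is inside the cell, hi is outside
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if decode(forward_point(x0, eps, mid), codebook) == k:
            lo = mid
        else:
            hi = mid
    return hi


# Usage: two codebook vectors on a line; the cell boundary lies midway.
codebook = np.array([[0.0, 0.0], [2.0, 0.0]])
x0 = codebook[0]
eps = np.array([4.0, 0.0])
t0 = boundary_time(x0, eps, codebook)
print("crossing time:", t0)
print("boundary point:", forward_point(x0, eps, t0))
```

In this toy setting, the point `forward_point(x0, eps, t0)` plays a role loosely analogous to the boundary point from which a boundary conditional trajectory would start; how the paper actually estimates the boundary and rescales the trajectory is not specified in this summary.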

Paper Structure

This paper contains 44 sections, 48 equations, 10 figures, 8 tables, 3 algorithms.

Figures (10)

  • Figure 1: (A) Blue and green curves are the learned probability density contours of the diffusion model for two data points. The red area is the discrete area of the blue data $\mathbf{x}_0$ and the boundary of this area is naturally a density contour. The discrete boundary is a complex hypersurface in the high-dimensional continuous space and we simplify it into a red line for convenience of description. As observed in the magnified part, the learned contours deviate from the boundary contour, resulting in inconsistent probability densities and gradient directions. (B) We treat the discrete boundary as a prior for the diffusion process to estimate a more appropriate probability distribution, where the learned contours are expected to follow the shape of the discrete boundary.
  • Figure 2: (A) Rescaled Probability Contours. The bold curve $1\sigma$ is the density contour of one standard deviation. As the time $t$ decreases from $T$ to $0$, the rescaled contours will gradually fit the discrete boundary and probability densities will also concentrate to this boundary. (B) Rescaled Forward Trajectory. Original forward trajectory $\mathbf{x}_0\!\rightarrow\!\mathbf{x}_{t_0}\!\rightarrow\!\mathbf{x}_\tau$ is rescaled to be a boundary conditional trajectory $\tilde{\mathbf{x}}_1\!\rightarrow\!\tilde{\mathbf{x}}_t$ that starts from $\tilde{\mathbf{x}}_1=\mathbf{x}_{t_0}$. The rescaled forward distribution $\tilde{p}_t(\tilde{\mathbf{x}}_t|\mathbf{x}_0)$ is transformed from the discrete boundary to Gaussian distributions.
  • Figure 3: Generated images of reproduced Bit Diffusion, DDIM, and Ours on Cifar-10.
  • Figure 4: We demonstrate the trajectory differences among Markovian Diffusion Process, Deterministic Diffusion and Flow Matching.
  • Figure 5: Generated Binary Coding images of reproduced Bit Diffusion and Ours on Cifar-10.
  • ...and 5 more figures