Discrete vs. Continuous Trade-offs for Generative Models

Jathin Korrapati, Tanish Baranwal, Rahul Shah

TL;DR

The paper investigates the trade-offs between discrete-time denoising diffusion probabilistic models and their continuous diffusion counterparts by deriving a discrete Girsanov-based bound on the KL divergence between the true reverse process and the score-estimated reverse process. It shows how score estimation errors propagate through the reverse dynamics and bounds distributional distance using Pinsker's inequality and the data processing inequality, yielding an overall error bound of the form $O\big(\sqrt{T}\,\epsilon_{score} + e^{-T}\sqrt{\mathrm{KL}(q\|\gamma^d)}\big)$. A discrete Girsanov theorem is developed to express the KL divergence in terms of drift mismatches, enabling rigorous analysis and practical guidance for discrete diffusion methods like DDPMs. The results illuminate how discrete-time implementations can achieve strong performance with controllable error, while also highlighting the theoretical richness of continuous-time formulations. Overall, the work provides a principled framework for bounding and comparing discrete and continuous diffusion models and informs efficient algorithm design for generative modeling.

Abstract

This work explores the theoretical and practical foundations of denoising diffusion probabilistic models (DDPMs) and score-based generative models, which leverage stochastic processes and Brownian motion to model complex data distributions. These models employ forward and reverse diffusion processes defined through stochastic differential equations (SDEs) to iteratively add and remove noise, enabling high-quality data generation. By analyzing the performance bounds of these models, we demonstrate how score estimation errors propagate through the reverse process, and we bound the total variation distance using discrete Girsanov transformations, Pinsker's inequality, and the data processing inequality (DPI), which together give an information-theoretic view of the error.
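
To make the chain "drift mismatch → KL divergence → total variation" concrete, here is a minimal Python sketch under simplifying assumptions that are not taken from the paper: two Euler-Maruyama chains share Gaussian noise with standard deviation `sigma * sqrt(h)` per step, and their drifts differ by a constant `delta` at every step. The names `delta`, `sigma`, `h`, and `n_steps` are illustrative. In this toy setting the path-space KL divergence is the sum of the per-step Gaussian KL terms, so it grows linearly in the horizon $T = n\,h$, and Pinsker's inequality turns it into a total variation bound of order $\sqrt{T}\,\delta$.

```python
import math


def path_kl_constant_mismatch(delta: float, sigma: float, h: float, n_steps: int) -> float:
    """KL between the path laws of two Gaussian-increment chains.

    Assumption (illustrative, not the paper's setting): each step is
    x_{k+1} = x_k + h * drift + sigma * sqrt(h) * noise, and the two
    chains' drifts differ by the constant `delta` at every step.
    """
    mean_gap = h * delta                # difference of the one-step means
    step_var = sigma ** 2 * h           # shared one-step variance
    per_step_kl = mean_gap ** 2 / (2.0 * step_var)  # KL of equal-variance Gaussians
    # Chain rule for KL: the path KL is the sum of the per-step conditional KLs.
    return n_steps * per_step_kl


def pinsker_tv_bound(kl: float) -> float:
    """Pinsker's inequality: TV(P, Q) <= sqrt(KL(P || Q) / 2), capped at 1."""
    return min(1.0, math.sqrt(kl / 2.0))


if __name__ == "__main__":
    for n in (100, 200, 400):
        kl = path_kl_constant_mismatch(delta=0.1, sigma=1.0, h=0.01, n_steps=n)
        print(n, kl, pinsker_tv_bound(kl))
```

Doubling the number of steps doubles the KL term but scales the total variation bound by only $\sqrt{2}$, which is the same square-root dependence on $T$ that appears in the $\sqrt{T}\,\epsilon_{score}$ term of the stated bound. The constants here depend on the toy noise scale and should not be read as the paper's constants.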

Paper Structure (15 sections, 34 equations, 6 figures, 1 table)

Figures (6)

  • Figure 1: Here we see that the $\mathcal{KL}$ divergence is clearly linear over time
  • Figure 2: Sample Trajectories
  • Figure 3: Forward and Reverse Trajectories from original $q$ to $q_T$
  • Figure 4: Original Distribution vs. Recovered Distribution of Data
  • Figure 5: Here we see how drift mismatches contribute to global errors via the Cumulative $\mathcal{KL}$ Divergence Plot, as demonstrated by eq. \ref{eq:dP-dQ}
  • ...and 1 more figure