Discrete vs. Continuous Trade-offs for Generative Models

Jathin Korrapati; Tanish Baranwal; Rahul Shah

TL;DR

The paper compares discrete-time denoising diffusion probabilistic models (DDPMs) with their continuous-time diffusion counterparts. Its main tool is a discrete Girsanov theorem that expresses the KL divergence between the true reverse process and the score-estimated reverse process in terms of drift mismatches. Combined with Pinsker's inequality and the data processing inequality, this shows how score estimation errors propagate through the reverse dynamics and yields an overall error bound of the form $O\big(\sqrt{T}\,\epsilon_{score} + e^{-T}\sqrt{\mathrm{KL}(q\|\gamma^d)}\big)$. The results indicate that discrete-time implementations such as DDPMs can achieve strong performance with controllable error, while continuous-time formulations remain the theoretically richer setting. Overall, the work offers a principled framework for bounding and comparing discrete and continuous diffusion models, with the aim of informing efficient algorithm design for generative modeling.
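
As a rough illustration of how the three ingredients named above can combine into the $\sqrt{T}\,\epsilon_{score}$ term, the following sketch works through the chain of inequalities for a simplified setting. The setup is an assumption made here for illustration and is not taken from the paper: two Gaussian Markov chains with step size $h$, $N = T/h$ steps, shared noise scale $\sigma$, the same initial distribution, and drifts $b_k$ (true) and $\hat b_k$ (estimated). The second term of the bound is not derived in this sketch.

```latex
% Assumed setup (illustrative, not the paper's notation):
%   under P: x_{k+1} = x_k + h b_k(x_k)      + \sigma\sqrt{h}\,\xi_k
%   under Q: x_{k+1} = x_k + h \hat b_k(x_k) + \sigma\sqrt{h}\,\xi_k,
%   with \xi_k \sim \mathcal{N}(0, I) i.i.d. and identical initial laws.
\begin{align*}
\mathrm{TV}(P_T, Q_T)
  &\le \sqrt{\tfrac{1}{2}\,\mathrm{KL}(P_T \,\|\, Q_T)}
  && \text{(Pinsker's inequality)} \\
\mathrm{KL}(P_T \,\|\, Q_T)
  &\le \mathrm{KL}(P_{0:N} \,\|\, Q_{0:N})
  && \text{(data processing: the endpoint is a function of the path)} \\
\mathrm{KL}(P_{0:N} \,\|\, Q_{0:N})
  &= \frac{h}{2\sigma^2} \sum_{k=0}^{N-1}
     \mathbb{E}_P \big\| b_k(x_k) - \hat b_k(x_k) \big\|^2
  && \text{(chain rule for Gaussian transitions)} \\
\intertext{If the drift mismatch satisfies
  $\mathbb{E}_P \| b_k - \hat b_k \|^2 \le \epsilon_{score}^2$ at every step, then}
\mathrm{KL}(P_{0:N} \,\|\, Q_{0:N})
  &\le \frac{T\,\epsilon_{score}^2}{2\sigma^2},
  \qquad
\mathrm{TV}(P_T, Q_T) \le \frac{\sqrt{T}\,\epsilon_{score}}{2\sigma}.
\end{align*}
```

Under these assumptions the path-space KL grows at most linearly in $T$, which is why the square root taken by Pinsker's inequality produces a $\sqrt{T}$ factor.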

Abstract

This work explores the theoretical and practical foundations of denoising diffusion probabilistic models (DDPMs) and score-based generative models, which leverage stochastic processes and Brownian motion to model complex data distributions. These models employ forward and reverse diffusion processes defined through stochastic differential equations (SDEs) to iteratively add and remove noise, enabling high-quality data generation. By analyzing the performance bounds of these models, we demonstrate how score estimation errors propagate through the reverse process and bound the total variation distance using discrete Girsanov transformations, Pinsker's inequality, and the data processing inequality (DPI), viewed through an information-theoretic lens.
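
To make the error-propagation idea concrete, here is a minimal numerical sketch, not the paper's experiment. It simulates a one-dimensional Gaussian Markov chain, accumulates the path-space KL divergence from per-step drift mismatches, cross-checks it against a Monte Carlo average of the log-likelihood ratio, and then applies Pinsker's inequality. The function name `path_kl`, the drifts `b` and `b_hat`, and all parameter values are hypothetical choices made for illustration.

```python
import numpy as np


def path_kl(b, b_hat, x0, h, sigma, n_steps, rng):
    """Cumulative KL(P || Q) between two Gaussian Markov chains (a sketch).

    Assumed dynamics (illustrative):
        under P: x_{k+1} = x_k + h * b(x_k)     + sigma * sqrt(h) * xi_k
        under Q: x_{k+1} = x_k + h * b_hat(x_k) + sigma * sqrt(h) * xi_k
    Returns the running KL from the drift-mismatch formula and a Monte Carlo
    estimate of the final KL from the averaged log-likelihood ratio.
    """
    x = np.array(x0, dtype=float)
    running_kl = np.zeros(n_steps)
    log_ratio = np.zeros_like(x)
    total = 0.0
    for k in range(n_steps):
        diff = b(x) - b_hat(x)  # drift mismatch at the current state
        # Per-step conditional KL between Gaussians with equal variance.
        total += h * np.mean(diff ** 2) / (2.0 * sigma ** 2)
        running_kl[k] = total
        xi = rng.standard_normal(x.shape)
        # Increment of log dP/dQ evaluated on a sample drawn from P.
        log_ratio += np.sqrt(h) * diff * xi / sigma + h * diff ** 2 / (2.0 * sigma ** 2)
        x = x + h * b(x) + sigma * np.sqrt(h) * xi  # advance under P
    return running_kl, float(log_ratio.mean())


# Hypothetical setup: mean-reverting drift with a constant estimation error.
eps, sigma, h, n_steps = 0.1, 1.0, 0.01, 100


def b(x):
    return -x


def b_hat(x):
    return -x + eps


rng = np.random.default_rng(0)
x0 = rng.standard_normal(200_000)
kl_curve, kl_mc = path_kl(b, b_hat, x0, h, sigma, n_steps, rng)

# Pinsker's inequality turns the accumulated KL into a total variation bound.
tv_bound = np.sqrt(kl_curve[-1] / 2.0)
print("KL from drift mismatches:", kl_curve[-1])
print("KL from Monte Carlo log-ratio:", kl_mc)
print("Pinsker bound on TV:", tv_bound)
```

With a constant mismatch the accumulated KL is linear in the number of steps, so the Pinsker bound scales with the square root of the total time $T = h \cdot$ `n_steps`. A state-dependent mismatch would make the curve depend on where the chain spends its time.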

Paper Structure (15 sections, 34 equations, 6 figures, 1 table)

Introduction and Background
Background
Background on denoising diffusion probabilistic modeling
Score Generative Models
TVD and W2 Distances
Total Variation Distance (TVD).
Wasserstein-2 Distance ($W_2$).
Brownian Motion
Introduction
Analysis Approach
Statement of Girsanov's Theorem
Proof of Girsanov's Discrete theorem
Implementation & Results
Benefits & Drawbacks between Discrete and Continuous Methods
Conclusion

Figures (6)

Figure 1: Here we see that the $\mathcal{KL}$ divergence is clearly linear over time
Figure 2: Sample Trajectories
Figure 3: Forward and Reverse Trajectories from original $q$ to $q_T$
Figure 4: Original Distribution vs. Recovered Distribution of Data
Figure 5: Here we see how drift mismatches contribute to global errors via the Cumulative $\mathcal{KL}$ Divergence Plot, as demonstrated by eq. \ref{eq:dP-dQ}
...and 1 more figures
