A Sharp Convergence Theory for The Probability Flow ODEs of Diffusion Models

Gen Li; Yuting Wei; Yuejie Chi; Yuxin Chen

TL;DR

This work provides the first non-asymptotic convergence guarantee with nearly linear dimension dependence for the probability flow ODE sampler in diffusion models, working in discrete time and assuming only ℓ2-accurate score estimates. It shows that on the order of d/ε iterations (up to logarithmic and lower-order terms) suffice to reach ε total-variation distance when the score estimates are exact, and it characterizes how estimation errors, both in the scores and in their Jacobians, propagate into sampling error. The analysis is elementary: it handles the discretization directly and does not rely on SDE/ODE toolboxes or stochastic calculus. Compared with prior results, it achieves better dependence on dimension and accuracy while imposing only minimal assumptions on the data distribution (no smoothness is required, and supports may be polynomially large). The framework also clarifies why Jacobian errors must be controlled in addition to score errors. Overall, the paper advances the theoretical understanding of deterministic diffusion samplers and shows that sharp guarantees can be obtained without continuous-time machinery.

Abstract

Diffusion models, which convert noise into new data instances by learning to reverse a diffusion process, have become a cornerstone in contemporary generative modeling. In this work, we develop non-asymptotic convergence theory for a popular diffusion-based sampler (i.e., the probability flow ODE sampler) in discrete time, assuming access to $\ell_2$-accurate estimates of the (Stein) score functions. For distributions in $\mathbb{R}^d$, we prove that $d/\varepsilon$ iterations -- modulo some logarithmic and lower-order terms -- are sufficient to approximate the target distribution to within $\varepsilon$ total-variation distance. This is the first result establishing nearly linear dimension-dependency (in $d$) for the probability flow ODE sampler. Imposing only minimal assumptions on the target data distribution (e.g., no smoothness assumption is imposed), our results also characterize how $\ell_2$ score estimation errors affect the quality of the data generation processes. In contrast to prior works, our theory is developed based on an elementary yet versatile non-asymptotic approach without the need of resorting to SDE and ODE toolboxes.
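To make the discrete-time sampler concrete, here is a minimal sketch in plain Python. It rests on several assumptions that are not stated in this summary: the forward process is taken to be $X_t = \sqrt{\alpha_t}\,X_{t-1} + \sqrt{1-\alpha_t}\,W_t$, the deterministic update is taken to be $Y_{t-1} = \big(Y_t + \tfrac{1-\alpha_t}{2}\, s_t(Y_t)\big)/\sqrt{\alpha_t}$, the target is a one-dimensional Gaussian so that the exact score is available in closed form, and `linear_alpha_schedule` is a hypothetical stand-in rather than the paper's learning rate schedule. All function names are illustrative.

```python
import math
import random
import statistics


def linear_alpha_schedule(T, beta_min=1e-4, beta_max=0.02):
    """Hypothetical schedule: alpha_t = 1 - beta_t with beta_t linear in t.

    This is NOT the paper's learning rate schedule; it is only a stand-in.
    Index t-1 of the returned list holds alpha_t for t = 1..T.
    """
    return [1.0 - (beta_min + (beta_max - beta_min) * i / (T - 1)) for i in range(T)]


def cumulative_products(alphas):
    """alpha_bar_t = alpha_1 * ... * alpha_t, stored at index t-1."""
    out, prod = [], 1.0
    for a in alphas:
        prod *= a
        out.append(prod)
    return out


def probability_flow_sample(y_T, alphas, score_fn):
    """Apply the deterministic update for t = T, ..., 1 and return Y_0.

    Assumed update: Y_{t-1} = (Y_t + (1 - alpha_t) / 2 * s_t(Y_t)) / sqrt(alpha_t).
    """
    y = y_T
    for t in range(len(alphas), 0, -1):
        a_t = alphas[t - 1]
        y = (y + 0.5 * (1.0 - a_t) * score_fn(t, y)) / math.sqrt(a_t)
    return y


def demo(num_samples=1000, T=1000, mu=2.0, sigma=0.5, seed=0):
    """Sample a 1-D Gaussian target N(mu, sigma^2) using its exact scores."""
    rng = random.Random(seed)
    alphas = linear_alpha_schedule(T)
    alpha_bars = cumulative_products(alphas)

    def score_fn(t, x):
        # Forward marginal at step t is Gaussian with
        # mean sqrt(alpha_bar_t) * mu and variance alpha_bar_t * sigma^2 + 1 - alpha_bar_t.
        ab = alpha_bars[t - 1]
        return -(x - math.sqrt(ab) * mu) / (ab * sigma ** 2 + 1.0 - ab)

    # Each sample starts from standard Gaussian noise; the reverse pass is deterministic.
    samples = [probability_flow_sample(rng.gauss(0.0, 1.0), alphas, score_fn)
               for _ in range(num_samples)]
    return statistics.fmean(samples), statistics.pstdev(samples)
```

Calling `demo()` returns the empirical mean and standard deviation of the generated samples, which should land near `mu` and `sigma` when `T` is large enough. The only randomness is the initial draw; every subsequent step is a deterministic function of the current iterate and the score, which is what distinguishes this sampler from SDE-based ones.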

Paper Structure (74 sections, 9 theorems, 222 equations)

Introduction
This paper.
Notation.
Preliminaries
Diffusion generative models
The forward process.
The reverse process.
The probability flow ODE
Convergence theory for the probability flow ODE sampler
Assumptions and learning rates
Score estimates.
Target data distributions.
Learning rate schedule.
Main results
Iteration complexity.
...and 59 more sections

Key Result

Theorem 1

Suppose that (eq:assumption-data-bounded) holds. Assume that the score estimates $s_t(\cdot)$ $(1 \le t \le T)$ satisfy Assumptions (assumption:score-estimate) and (assumption:score-estimate-Jacobi). Then the sampling process (eqn:ode-sampling) with the learning rate schedule (eqn:alpha-t) satisfies for some universal constants $C_1>0$, provided that $T \ge C_2d^2\log^5 T$ for some large enough constan

Theorems & Definitions (14)

Definition 1: Score function
Theorem 1
Lemma 1
Lemma 2
Remark 1
Lemma 3
Lemma 4
proof
Remark 2
Lemma 5
...and 4 more
