On the Trajectory Regularity of ODE-based Diffusion Sampling

Defang Chen, Zhenyu Zhou, Can Wang, Chunhua Shen, Siwei Lyu

TL;DR

The paper investigates the trajectory regularity of ODE-based diffusion sampling, revealing a boomerang-shaped, largely straight sampling path governed by an implicit denoising trajectory. By modeling data with kernel density estimation, it derives a closed-form denoising solution linked to annealed mean shift and demonstrates that the sampling length scales as $\sigma_T\sqrt{d}$ with a nearly constant vector-field magnitude. Leveraging this regularity, it introduces Geometry-Inspired Time Scheduling (GITS), a dynamic-programming approach that re-allocates time steps using a small warmup, yielding substantial FID gains at few function evaluations with minimal overhead. The method provides theoretical and empirical insights into the sampling mechanism, offering a practical, fast route to higher-quality image generation on standard benchmarks. Overall, the work connects trajectory structure, KDE-based denoising, and DP-based scheduling to deliver efficient, principled acceleration for diffusion samplers.
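
The $\sigma_T\sqrt{d}$ scaling has a simple intuition: a sample drawn from $\mathcal{N}(\mathbf{0}, \sigma_T^2\mathbf{I})$ in $d$ dimensions has a norm that concentrates around $\sigma_T\sqrt{d}$, so a nearly straight path from such a sample to a comparatively small-norm data point has roughly that length. The following is a minimal sketch of that concentration only. The value $\sigma_T = 80$, the image dimensions, and the helper name `gaussian_norm` are assumptions chosen for illustration.

```python
import math
import random


def gaussian_norm(dim, sigma, rng):
    """Euclidean norm of one sample from N(0, sigma^2 I) in `dim` dimensions."""
    return math.sqrt(sum(rng.gauss(0.0, sigma) ** 2 for _ in range(dim)))


rng = random.Random(0)
sigma_T = 80.0  # assumed terminal noise level
for dim in (3 * 32 * 32, 3 * 64 * 64):  # assumed image sizes
    norm = gaussian_norm(dim, sigma_T, rng)
    ratio = norm / (sigma_T * math.sqrt(dim))
    # The ratio should be close to 1 and get closer as dim grows.
    print(dim, round(ratio, 3))
```

The relative spread of the norm shrinks roughly like $1/\sqrt{2d}$, which is why the ratio stays close to 1 at image-scale dimensions.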

Abstract

Diffusion-based generative models use stochastic differential equations (SDEs) and their equivalent ordinary differential equations (ODEs) to establish a smooth connection between a complex data distribution and a tractable prior distribution. In this paper, we identify several intriguing trajectory properties in the ODE-based sampling process of diffusion models. We characterize an implicit denoising trajectory and discuss its vital role in forming the coupled sampling trajectory with a strong shape regularity, regardless of the generated content. We also describe a dynamic programming-based scheme to make the time schedule in sampling better fit the underlying trajectory structure. This simple strategy requires minimal modification to any given ODE-based numerical solver and incurs negligible computational cost, while delivering superior performance in image generation, especially with $5\sim 10$ function evaluations.
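
The dynamic-programming idea can be sketched generically as follows. This is not the authors' implementation: it assumes a fine grid of candidate time points and a precomputed cost `cost[i][j]` for jumping directly from grid index `i` to a later index `j` (for example, an error estimate from a few warmup samples), and it picks the fixed-length schedule with the smallest summed cost. The function name `select_schedule` and the squared-gap toy cost are hypothetical.

```python
import math


def select_schedule(cost, num_steps):
    """Return (path, total): grid indices from 0 to m-1 using exactly
    `num_steps` jumps, minimizing the sum of cost[i][j] over the jumps."""
    m = len(cost)
    # best[k][j]: minimal cost of reaching index j with exactly k jumps.
    best = [[math.inf] * m for _ in range(num_steps + 1)]
    prev = [[-1] * m for _ in range(num_steps + 1)]
    best[0][0] = 0.0
    for k in range(1, num_steps + 1):
        for j in range(1, m):
            for i in range(j):
                cand = best[k - 1][i] + cost[i][j]
                if cand < best[k][j]:
                    best[k][j] = cand
                    prev[k][j] = i
    if math.isinf(best[num_steps][m - 1]):
        raise ValueError("no schedule with that many steps exists")
    # Backtrack from the final grid point to recover the chosen indices.
    path, j = [m - 1], m - 1
    for k in range(num_steps, 0, -1):
        j = prev[k][j]
        path.append(j)
    return path[::-1], best[num_steps][m - 1]


# Toy grid of noise levels and a stand-in cost (squared gap between levels).
grid = [8.0, 6.0, 5.0, 4.0, 2.0, 1.0, 0.0]
cost = [[(a - b) ** 2 for b in grid] for a in grid]
path, total = select_schedule(cost, num_steps=2)
print([grid[i] for i in path], total)
```

The search costs $O(K m^2)$ for $K$ steps over $m$ grid points, which is small next to the network evaluations needed to estimate the cost matrix.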

Paper Structure

This paper contains 30 sections, 15 theorems, 52 equations, 18 figures, 10 tables, 1 algorithm.

Key Result

Proposition 4.1

Given the probability flow ODE (eq:epf_ode) and a current position $\hat{\mathbf{x}}_{t_{n+1}}$, $n\in[0, N-1]$, in the sampling trajectory, the next position $\hat{\mathbf{x}}_{t_{n}}$ predicted by a $k$-th order Taylor expansion with the time step size $\sigma_{t_{n+1}}-\sigma_{t_n}$ is a convex combination of $\hat{\mathbf{x}}_{t_{n+1}}$ and the generalized denoising output …
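
For the first-order (Euler) case, the convex-combination structure can be checked numerically. The sketch below assumes an EDM-style probability flow ODE, $\mathrm{d}\mathbf{x}/\mathrm{d}\sigma = (\mathbf{x} - r(\mathbf{x};\sigma))/\sigma$, where $r$ denotes the denoising output. That form, the function names, and the toy numbers are illustrative assumptions, not details given in the statement above.

```python
def euler_step(x, denoised, sigma_cur, sigma_next):
    """One explicit Euler step of dx/dsigma = (x - denoised) / sigma."""
    slope = [(xi - di) / sigma_cur for xi, di in zip(x, denoised)]
    return [xi + (sigma_next - sigma_cur) * si for xi, si in zip(x, slope)]


def convex_combination(x, denoised, sigma_cur, sigma_next):
    """The same step written as w * x + (1 - w) * denoised."""
    w = sigma_next / sigma_cur  # lies in [0, 1] when 0 <= sigma_next <= sigma_cur
    return [w * xi + (1.0 - w) * di for xi, di in zip(x, denoised)]


x = [3.0, -1.0, 0.5]          # current position (toy values)
denoised = [1.0, 0.0, 0.0]    # denoising output at x (toy values)
step_a = euler_step(x, denoised, sigma_cur=2.0, sigma_next=0.5)
step_b = convex_combination(x, denoised, sigma_cur=2.0, sigma_next=0.5)
print(step_a, step_b)
```

Under this assumption the weight on the current position is $\sigma_{t_n}/\sigma_{t_{n+1}}$, so a larger step moves the next position closer to the denoising output.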

Figures (18)

  • Figure 1: A geometric picture of ODE-based sampling in diffusion models. Each initial sample (from the noise distribution) starts from a big sphere and converges to the final sample (in the data manifold) along a regular sampling trajectory, which is controlled by an implicit denoising trajectory.
  • Figure 2: The sampling trajectory exhibits a very small trajectory deviation (red curve) compared to the sample distance (blue curve) in the sampling process starting from $t_{N}=80$ to $t_0=0.002$.
  • Figure 3: (a) We adopt the $d$-dimensional vector $\hat{\mathbf{x}}_{t_N}-\hat{\mathbf{x}}_{t_0}$ and several top principal components (PCs) of its $(d-1)$-dimensional orthogonal complement to approximate the original $d$-dimensional sampling trajectory. (b) A visual comparison of trajectory reconstruction on ImageNet 64$\times$64. We reconstruct the real sampling trajectory (top row) using $\hat{\mathbf{x}}_{t_N}-\hat{\mathbf{x}}_{t_0}$ (1-D recon.) along with its top 1 or 2 principal components (2-D or 3-D recon.). To amplify the visual difference, we present the denoising outputs of these trajectories. (c) We calculate the $L^2$ distance between the real trajectory and the reconstructed trajectories up to 5-D reconstruction. (d) The variance explained by the top $k$ principal components. We report the ratio of the sum of the top $k$ eigenvalues to the sum of all eigenvalues.
  • Figure 4: We project 30 high-dimensional sampling trajectories generated on three different datasets into a 3-D subspace. These trajectories are first aligned to the direction of $\hat{\mathbf{x}}_{t_N}-\hat{\mathbf{x}}_{t_0}$ (this direction is different for each sample), and then projected onto the top 2 principal components of the space orthogonal to $\hat{\mathbf{x}}_{t_N}-\hat{\mathbf{x}}_{t_0}$. See the text for more details.
  • Figure 5: An illustration of two consecutive Euler steps, starting from a current sample $\hat{\mathbf{x}}_{t_{n+1}}$. A single Euler step in the ODE-based sampling is a convex combination of the denoising output and the current position to determine the next position. Blue points form a piecewise linear sampling trajectory, while red points form the denoising trajectory governing the rotation direction.
  • ...and 13 more figures
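
The two curves compared in Figure 2 can be illustrated with a small sketch. It assumes that "trajectory deviation" means the perpendicular distance from each intermediate sample to the straight line through the two endpoints, and that "sample distance" means the distance from each sample to the final sample. Both definitions, the function names, and the toy 3-D trajectory are assumptions made for illustration.

```python
import math


def trajectory_deviation(trajectory):
    """Perpendicular distance of each point to the line through the endpoints."""
    start, end = trajectory[0], trajectory[-1]
    direction = [s - e for s, e in zip(start, end)]
    dir_sq = sum(c * c for c in direction)
    deviations = []
    for point in trajectory:
        offset = [p - e for p, e in zip(point, end)]
        # Component of the offset along the start-to-end direction.
        coeff = sum(o * c for o, c in zip(offset, direction)) / dir_sq
        residual = [o - coeff * c for o, c in zip(offset, direction)]
        deviations.append(math.sqrt(sum(r * r for r in residual)))
    return deviations


def distance_to_final(trajectory):
    """Distance of each point to the last point of the trajectory."""
    end = trajectory[-1]
    return [math.dist(point, end) for point in trajectory]


# Toy trajectory: nearly straight, with a small sideways bend in the middle.
toy = [[4.0, 0.0, 0.0], [3.0, 0.3, 0.0], [1.0, 0.1, 0.0], [0.0, 0.0, 0.0]]
print(trajectory_deviation(toy))
print(distance_to_final(toy))
```

On this toy path the deviation stays an order of magnitude below the distance to the final sample, which is the qualitative pattern the caption describes.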

Theorems & Definitions (28)

  • Remark 2.1: Proofs in Section \ref{subsec:equivalence}
  • Proposition 4.1
  • Corollary 4.2
  • Corollary 4.3
  • Remark 1.1
  • Lemma 1.2
  • proof
  • Proposition 1.3
  • proof
  • Corollary 1.4
  • ...and 18 more