A Geometric Perspective on Diffusion Models
Defang Chen, Zhenyu Zhou, Jian-Ping Mei, Chunhua Shen, Chun Chen, Can Wang
TL;DR
This work provides a geometric lens on diffusion models, focusing on the variance-exploding SDE (VE-SDE) and its probability-flow ODE. It reveals two coupled trajectories that govern sampling: a quasi-linear sampling trajectory connecting data and noise, and an implicit denoising trajectory that converges faster. It shows that second-order samplers arise as finite differences of the denoising trajectory and establishes a theoretical link between optimal ODE-based sampling and annealed mean shift, under which sample likelihood increases monotonically under mild conditions. The analysis also gives practical insights, including the ODE-Jump strategy, and explains why modest score deviation can preserve generative ability while mitigating mode collapse. Through a change-of-variables view, the results extend to other SDE families and inform fast sampling, distillation-based methods, and latent interpolation. Overall, the paper deepens the understanding of diffusion dynamics and offers actionable directions for faster, more reliable generation.
Abstract
Recent years have witnessed significant progress in developing effective training and fast sampling techniques for diffusion models. A remarkable advancement is the use of stochastic differential equations (SDEs) and their marginal-preserving ordinary differential equations (ODEs) to describe data perturbation and generative modeling in a unified framework. In this paper, we carefully inspect the ODE-based sampling of a popular variance-exploding SDE and reveal several intriguing structures of its sampling dynamics. We discover that the data distribution and the noise distribution are smoothly connected by a quasi-linear sampling trajectory and by an implicit denoising trajectory that converges even faster. Moreover, the denoising trajectory governs the curvature of the corresponding sampling trajectory, and its finite differences yield various second-order samplers used in practice. Furthermore, we establish a theoretical relationship between the optimal ODE-based sampling and the classic mean-shift (mode-seeking) algorithm, with which we can characterize the asymptotic behavior of diffusion models and identify the empirical score deviation. Code is available at https://github.com/zju-pi/diff-sampler.
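
The relationship between ODE-based sampling and mean shift can be illustrated on a toy problem. The following is a minimal sketch, not the authors' implementation (which lives in the repository above). It assumes the probability-flow ODE is written in the noise-level parameterization dx/dσ = (x − D(x; σ))/σ, and that the data distribution is a small empirical point set, so that the optimal denoiser D is a softmax-weighted average of the data points, i.e. one Gaussian-kernel mean-shift step with bandwidth σ. The function names, the three-point dataset, and the noise schedule are all hypothetical choices made for illustration.

```python
import numpy as np

def optimal_denoiser(x, data, sigma):
    """Posterior mean E[x0 | x] for an empirical data distribution under
    Gaussian noise of std sigma. Equivalent to one Gaussian-kernel
    mean-shift step with bandwidth sigma (assumed setting, see lead-in)."""
    sq_dist = np.sum((data - x) ** 2, axis=1)
    logits = -sq_dist / (2.0 * sigma ** 2)
    logits -= logits.max()              # stabilize the softmax
    weights = np.exp(logits)
    weights /= weights.sum()
    return weights @ data

def euler_sample(x, data, sigmas):
    """Euler solver for dx/dsigma = (x - D(x; sigma)) / sigma.
    Returns the sampling trajectory and the denoising trajectory."""
    sampling, denoising = [x.copy()], []
    for s_cur, s_next in zip(sigmas[:-1], sigmas[1:]):
        d = optimal_denoiser(x, data, s_cur)
        # Algebraically equal to (s_next/s_cur) * x + (1 - s_next/s_cur) * d,
        # a convex combination of the current sample and the denoiser output.
        x = x + (s_next - s_cur) * (x - d) / s_cur
        sampling.append(x.copy())
        denoising.append(d)
    return np.array(sampling), np.array(denoising)

rng = np.random.default_rng(0)
data = np.array([[-2.0, 0.0], [2.0, 0.0], [0.0, 3.0]])  # hypothetical toy dataset
sigmas = np.geomspace(80.0, 0.002, 50)                  # hypothetical noise schedule
x_T = sigmas[0] * rng.standard_normal(2)                # start from pure noise
sampling, denoising = euler_sample(x_T, data, sigmas)

nearest = np.min(np.linalg.norm(data - sampling[-1], axis=1))
print("distance from final sample to nearest data point:", nearest)
```

Each Euler step here moves the sample part of the way toward the current denoiser output, so in this toy setting the sampling trajectory trails the denoising trajectory. Plotting `sampling` against `denoising` is one way to inspect the two trajectories side by side.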
