Nonequilibrium physics of generative diffusion models
Zhendong Yu, Haiping Huang
TL;DR
This paper reframes generative diffusion models (GDMs) as a nonequilibrium physics problem, treating forward diffusion as an Ornstein–Uhlenbeck Langevin process and the reverse generative step as a statistical-inference-driven dynamics with the score function guiding denoising. Using a path-integral perspective, it derives fluctuation theorems and entropy production for both forward and reverse processes, and introduces a potential/free-energy framework to characterize phase transitions in the reverse dynamics, including a speciation transition and a glass-like fragmentation analyzed via the Franz–Parisi potential. The work provides analytic results in a Gaussian-mixture data setting, revealing that reverse diffusion can be viewed as minimizing a generalized free energy and that the dynamic-state variable acts as quenched disorder akin to spin-glass systems. By linking stochastic thermodynamics, statistical inference, and geometry-based methods, the paper offers a coherent theoretical picture of GDMs with implications for understanding sampling dynamics and guiding the design of diffusion-based generative models.
Abstract
Generative diffusion models apply the concept of Langevin dynamics from physics to machine learning and have attracted considerable interest from engineering, statistics, and physics, but a complete picture of their inherent mechanisms is still lacking. In this paper, we provide a transparent physics analysis of diffusion models, formulating the fluctuation theorem, entropy production, equilibrium measure, and Franz-Parisi potential to understand the dynamic process and intrinsic phase transitions. Our analysis is rooted in a path integral representation of both forward and backward dynamics, and in treating the reverse diffusion generative process as statistical inference, where the time-dependent state variables serve as quenched disorder akin to that in spin glass theory. Our study thus links stochastic thermodynamics, statistical inference, and geometry-based analysis to yield a coherent picture of how generative diffusion models work.
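The forward Ornstein–Uhlenbeck dynamics and the score-guided reverse dynamics summarized above can be illustrated with a minimal one-dimensional sketch. It assumes a symmetric two-Gaussian mixture as data, the forward convention dx = -x dt + sqrt(2) dW, and a plain Euler-Maruyama discretization of the reverse process. The parameter values, the function names, and the toy "speciation time" (the moment the marginal density p_t turns bimodal in this 1D example) are illustrative choices, not taken from the paper.

```python
import numpy as np

# Assumed toy data: 0.5*N(+MU, SIGMA2) + 0.5*N(-MU, SIGMA2), with MU**2 > SIGMA2.
MU, SIGMA2 = 2.0, 0.25

def marginal_params(t):
    """Mode position m(t) and per-mode variance v(t) of p_t
    under the forward OU process dx = -x dt + sqrt(2) dW."""
    a = np.exp(-t)
    return MU * a, SIGMA2 * a**2 + 1.0 - a**2

def score(x, t):
    """Exact score d/dx log p_t(x) for the symmetric two-Gaussian mixture."""
    m, v = marginal_params(t)
    return (-x + m * np.tanh(m * x / v)) / v

def reverse_sample(n_samples=20000, T=5.0, n_steps=500, seed=0):
    """Euler-Maruyama integration of the reverse SDE
    dy = [y + 2 * score(y, T - s)] ds + sqrt(2) dW, for s from 0 to T."""
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    y = rng.standard_normal(n_samples)  # p_T is close to N(0, 1) for large T
    for k in range(n_steps):
        t = T - k * dt  # forward time corresponding to this reverse step
        drift = y + 2.0 * score(y, t)
        y = y + drift * dt + np.sqrt(2.0 * dt) * rng.standard_normal(n_samples)
    return y

def toy_speciation_time():
    """Forward time at which p_t switches between unimodal and bimodal
    in this 1D toy, i.e. where m(t)**2 == v(t)."""
    return 0.5 * np.log(MU**2 - SIGMA2 + 1.0)

samples = reverse_sample()
print("sample mean:", samples.mean())
print("mean |x|:", np.abs(samples).mean(), "(data modes at +/-", MU, ")")
print("sample variance:", samples.var(), "(data variance:", MU**2 + SIGMA2, ")")
print("toy speciation time:", toy_speciation_time())
```

If the sketch behaves as intended, the generated samples should split into two clusters near ±MU, and the effective potential -log p_t should change from a single well to a double well as the reverse process passes the toy speciation time. The analytic score is used here only because the data distribution is known in closed form; in practice the score is learned from data.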
