The Spacetime of Diffusion Models: An Information Geometry Perspective
Rafał Karczewski, Markus Heinonen, Alison Pouplin, Søren Hauberg, Vikas Garg
TL;DR
The paper studies the latent geometry of diffusion models, contrasting the conventional pullback approach with a Fisher-Rao information-geometric framework built on denoising posteriors. By introducing a latent spacetime $z = (\boldsymbol{x}_t, t)$ and exploiting the exponential-family structure of $p(\boldsymbol{x}_0 | \boldsymbol{x}_t)$, it derives simulation-free estimators of curve length that make geodesic computation tractable. It also defines the Diffusion Edit Distance, under which geodesics trace minimal sequences of noise and denoise edits between data points. The approach is further applied to transition path sampling in molecular systems, including constrained variants such as low-variance transitions and region avoidance. Together, these contributions give a geometric account of diffusion models and new tools for editing and sampling in high-dimensional data spaces.
Abstract
We present a novel geometric perspective on the latent space of diffusion models. We first show that the standard pullback approach, utilizing the deterministic probability flow ODE decoder, is fundamentally flawed. It provably forces geodesics to decode as straight segments in data space, effectively ignoring any intrinsic data geometry beyond the ambient Euclidean space. Complementing this view, diffusion also admits a stochastic decoder via the reverse SDE, which enables an information geometric treatment with the Fisher-Rao metric. However, a choice of $x_T$ as the latent representation collapses this metric due to memorylessness. We address this by introducing a latent spacetime $z=(x_t,t)$ that indexes the family of denoising distributions $p(x_0 | x_t)$ across all noise scales, yielding a nontrivial geometric structure. We prove these distributions form an exponential family and derive simulation-free estimators for curve lengths, enabling efficient geodesic computation. The resulting structure induces a principled Diffusion Edit Distance, where geodesics trace minimal sequences of noise and denoise edits between data. We also demonstrate benefits for transition path sampling in molecular systems, including constrained variants such as low-variance transitions and region avoidance. Code is available at: https://github.com/rafalkarczewski/spacetime-geometry
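To make the exponential-family claim concrete, the following is a minimal sketch of a simulation-free curve-length estimate on a toy problem. It is not the authors' implementation. It assumes a 1D standard Gaussian data distribution (so the denoising posterior is available in closed form), a hypothetical variance-preserving schedule with $\alpha_t = \sqrt{1-t}$ and $\sigma_t = \sqrt{t}$, and forward noising $x_t = \alpha_t x_0 + \sigma_t \varepsilon$. It relies on the standard exponential-family identity that the squared Fisher-Rao line element equals $d\eta \cdot d\mu$, where $\eta$ are the natural parameters and $\mu$ the expectation parameters. The function names `schedule`, `natural_params`, `expectation_params`, and `curve_length` are illustrative.

```python
import numpy as np

def schedule(t):
    """Hypothetical variance-preserving schedule: alpha_t^2 + sigma_t^2 = 1."""
    return np.sqrt(1.0 - t), np.sqrt(t)

def natural_params(x_t, t):
    """Natural parameters of p(x_0 | x_t) w.r.t. sufficient statistics (x_0, x_0^2).

    Terms that do not depend on (x_t, t) are absorbed into the base measure.
    """
    alpha, sigma = schedule(t)
    return np.array([alpha * x_t / sigma**2, -alpha**2 / (2.0 * sigma**2)])

def expectation_params(x_t, t):
    """Expectation parameters (E[x_0 | x_t], E[x_0^2 | x_t]).

    Closed form only because the toy data distribution is N(0, 1).
    """
    alpha, sigma = schedule(t)
    var = sigma**2 / (sigma**2 + alpha**2)
    mean = alpha * x_t / (sigma**2 + alpha**2)
    return np.array([mean, var + mean**2])

def curve_length(points):
    """Approximate Fisher-Rao length of a discretized spacetime curve.

    `points` is a sequence of (x_t, t) pairs with 0 < t < 1. Each segment
    contributes sqrt(delta_eta . delta_mu); no sampling or ODE/SDE
    simulation is required.
    """
    eta = np.array([natural_params(x, t) for x, t in points])
    mu = np.array([expectation_params(x, t) for x, t in points])
    seg = np.sum(np.diff(eta, axis=0) * np.diff(mu, axis=0), axis=1)
    # Each term is non-negative in exact arithmetic; clip rounding error.
    return float(np.sum(np.sqrt(np.maximum(seg, 0.0))))

# Usage: a curve that starts at low noise, passes through higher noise,
# and returns to low noise at a different location.
path = [(-1.0, 0.2), (-0.5, 0.4), (0.0, 0.6), (0.5, 0.4), (1.0, 0.2)]
print(curve_length(path))
```

In this toy setting each segment term $\Delta\eta \cdot \Delta\mu$ equals the symmetrized KL divergence between the two denoising posteriors, which is why it is non-negative. With a trained diffusion model the closed-form expectations would have to be replaced by quantities estimated from the network; how the paper does this is not shown here.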
