The Spacetime of Diffusion Models: An Information Geometry Perspective

Rafał Karczewski, Markus Heinonen, Alison Pouplin, Søren Hauberg, Vikas Garg

TL;DR

The paper studies the latent geometry of diffusion models, contrasting the conventional pullback approach with a Fisher-Rao information-geometric treatment of the denoising posteriors. By introducing a latent spacetime $z=(x_t,t)$ and exploiting the exponential-family structure of $p(x_0 | x_t)$, it derives simulation-free estimators of curve length that make geodesic computation tractable, and it defines the Diffusion Edit Distance as the length of the geodesic between two data points, a path that traces a minimal sequence of noise and denoise edits. The approach is also applied to transition path sampling in molecular systems, including constrained variants such as low-variance transitions and region avoidance.

Abstract

We present a novel geometric perspective on the latent space of diffusion models. We first show that the standard pullback approach, utilizing the deterministic probability flow ODE decoder, is fundamentally flawed. It provably forces geodesics to decode as straight segments in data space, effectively ignoring any intrinsic data geometry beyond the ambient Euclidean space. Complementing this view, diffusion also admits a stochastic decoder via the reverse SDE, which enables an information geometric treatment with the Fisher-Rao metric. However, a choice of $x_T$ as the latent representation collapses this metric due to memorylessness. We address this by introducing a latent spacetime $z=(x_t,t)$ that indexes the family of denoising distributions $p(x_0 | x_t)$ across all noise scales, yielding a nontrivial geometric structure. We prove these distributions form an exponential family and derive simulation-free estimators for curve lengths, enabling efficient geodesic computation. The resulting structure induces a principled Diffusion Edit Distance, where geodesics trace minimal sequences of noise and denoise edits between data. We also demonstrate benefits for transition path sampling in molecular systems, including constrained variants such as low-variance transitions and region avoidance. Code is available at: https://github.com/rafalkarczewski/spacetime-geometry
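
The claim that pullback geodesics decode to straight segments can be illustrated with a toy example. The sketch below is not the paper's code: `decode` is a hypothetical invertible map on the plane standing in for a deterministic decoder, and `encode` is its exact inverse. Under the pullback of the Euclidean metric, the length of a latent curve equals the Euclidean length of its decoded image, so the shortest latent curve between two codes is the preimage of the straight segment between their decodings.

```python
import numpy as np

def decode(z):
    """Toy invertible decoder (hypothetical stand-in, not the PF-ODE)."""
    z = np.asarray(z, dtype=float)
    return np.stack([z[..., 0], z[..., 1] + z[..., 0] ** 2], axis=-1)

def encode(x):
    """Exact inverse of `decode`."""
    x = np.asarray(x, dtype=float)
    return np.stack([x[..., 0], x[..., 1] - x[..., 0] ** 2], axis=-1)

def pullback_length(latent_curve):
    """Polyline length of the decoded curve, which approximates the
    length of the latent curve under the pullback metric J^T J."""
    decoded = decode(latent_curve)
    return np.linalg.norm(np.diff(decoded, axis=0), axis=1).sum()

z_a, z_b = np.array([-1.0, 0.0]), np.array([1.0, 0.5])
s = np.linspace(0.0, 1.0, 201)[:, None]

# Candidate 1: a straight line in latent space.
latent_line = (1 - s) * z_a + s * z_b

# Candidate 2: the latent preimage of the straight data-space segment.
data_line = (1 - s) * decode(z_a) + s * decode(z_b)
pullback_geodesic = encode(data_line)

len_latent_line = pullback_length(latent_line)
len_geodesic = pullback_length(pullback_geodesic)
chord = np.linalg.norm(decode(z_b) - decode(z_a))

print(f"latent straight line: {len_latent_line:.4f}")
print(f"preimage of segment:  {len_geodesic:.4f}")
print(f"data-space chord:     {chord:.4f}")
```

In this toy setting the second candidate is curved in latent space yet attains the chord length, which is the lower bound for any curve joining the two decoded endpoints. The decoder's nonlinearity therefore shapes the latent curve but leaves no trace in data space.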

Paper Structure

This paper contains 53 sections, 10 theorems, 77 equations, 8 figures, 1 table, 1 algorithm.

Key Result

Proposition 5.1

The energy of a discretized spacetime curve ${\bm{\gamma}}=\{{\bm{z}}_n\}_{n=0}^{N-1}$ with ${\bm{z}}_n=({\bm{x}}_{t_n},t_n)$ admits an approximation where

Figures (8)

  • Figure 1: A geodesic in spacetime is the shortest path between denoising distributions.
  • Figure 2: The pullback geodesics curve in noise space, but decode to straight lines in data space.
  • Figure 3: PF-ODE paths are similar to energy-minimizing geodesics. Left: Geodesics move in straighter lines than PF-ODE trajectories in a 1D toy density. Right: Geodesics are almost indistinguishable from PF-ODE sampling in the ImageNet-512 EDM2 model.
  • Figure 4: Spacetime geodesics between images. Each row shows a geodesic ${\bm{\gamma}}$ between clean images. The path passes through noisy states and then denoises, realizing the minimal total edit between endpoints. Its length $\ell({\bm{\gamma}})$ is the Diffusion Edit Distance (DiffED), which measures how much the denoising distribution changes along the optimal traversal.
  • Figure 5: Spacetime geodesics enable sampling transition paths between low-energy states. Left: Alanine Dipeptide energy landscape with respect to two dihedral angles, with two energy minima ${\bm{x}}_0^1, {\bm{x}}_0^2$. Middle: Spacetime geodesic ${\bm{\gamma}}$ connecting ${\bm{x}}_0^1$ and ${\bm{x}}_0^2$. Right: Annealed Langevin transition path samples.
  • ...and 3 more figures

Theorems & Definitions (20)

  • Proposition 5.1: Spacetime energy estimation - informal
  • Lemma B.1
  • proof
  • Proposition B.1: Pullback geodesics decode to straight lines
  • proof
  • Definition C.1: Exponential Family
  • Proposition C.1: Fisher-Rao metric for an exponential family
  • proof
  • Corollary C.1: Energy function for an exponential family
  • proof
  • ...and 10 more
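
The entries on the Fisher-Rao metric and energy function for an exponential family rest on a standard identity: with natural parameters $\eta$ and expectation parameters $\mu$, the line element is $ds^2 = d\eta^\top d\mu$, and the symmetrized KL divergence between two members equals $(\eta_p-\eta_q)^\top(\mu_p-\mu_q)$. The sketch below checks this on 1D Gaussians and uses it to estimate the energy of a discretized curve. It is a generic illustration, not the paper's spacetime estimator: the function names are hypothetical, and it assumes the energy convention $\tfrac{1}{2}\int_0^1 \|\dot\gamma\|^2\,ds$.

```python
import numpy as np

def natural_params(mean, var):
    """Natural parameters of N(mean, var): (mean/var, -1/(2 var))."""
    mean, var = np.asarray(mean, float), np.asarray(var, float)
    return np.stack([mean / var, -0.5 / var], axis=-1)

def expectation_params(mean, var):
    """Expectation parameters of N(mean, var): (E[x], E[x^2])."""
    mean, var = np.asarray(mean, float), np.asarray(var, float)
    return np.stack([mean, mean ** 2 + var], axis=-1)

def gaussian_kl(m_p, v_p, m_q, v_q):
    """KL(N(m_p, v_p) || N(m_q, v_q)) in closed form."""
    return 0.5 * (np.log(v_q / v_p) + (v_p + (m_p - m_q) ** 2) / v_q - 1.0)

def discrete_energy(means, variances):
    """Energy of a curve of Gaussians sampled at N points, assuming the
    convention E = 1/2 * integral of squared speed over [0, 1]."""
    eta = natural_params(means, variances)
    mu = expectation_params(means, variances)
    n_segments = len(means) - 1
    # Each increment is d_eta . d_mu, the symmetrized KL of neighbours.
    increments = np.sum(np.diff(eta, axis=0) * np.diff(mu, axis=0), axis=1)
    return 0.5 * n_segments * increments.sum()

# Check the identity between symmetrized KL and d_eta . d_mu.
sym_kl = gaussian_kl(0.0, 1.0, 1.0, 2.0) + gaussian_kl(1.0, 2.0, 0.0, 1.0)
dot = (natural_params(1.0, 2.0) - natural_params(0.0, 1.0)) @ (
    expectation_params(1.0, 2.0) - expectation_params(0.0, 1.0)
)

# Curve that shifts the mean from 0 to 1 at unit variance; the metric
# there is dm^2 / var, so the speed is 1 and the energy is 1/2.
energy = discrete_energy(np.linspace(0.0, 1.0, 101), np.ones(101))
print(sym_kl, dot, energy)
```

The estimate needs only the two parameterizations at each curve point, with no integration along the curve and no sampling. For denoising distributions, $\mu$ would have to come from a trained denoiser, a step this toy example does not cover.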