Follow the Energy, Find the Path: Riemannian Metrics from Energy-Based Models
Louis Béthune, David Vigouroux, Yilun Du, Rufin VanRullen, Thomas Serre, Victor Boutin
TL;DR
This work addresses the challenge of shortest-path computation for high-dimensional data that lie on curved manifolds by deriving Riemannian metrics from pretrained Energy-Based Models (EBMs). It introduces two conformal metrics, $\mathbf{G}_{E_{\theta}}(\mathbf{x})$ and $\mathbf{G}_{1/p_{\theta}}(\mathbf{x})$, and uses a neural interpolant to approximate geodesics that stay close to the data manifold while respecting its curvature. Across toy 2D mixtures, rotated character manifolds, and AFHQ latent spaces, the EBM-derived metrics consistently outperform baselines such as $\mathbf{G}_{LAND}$ and $\mathbf{G}_{RBF}$ in manifold alignment and geodesic fidelity, with the $\mathbf{G}_{E_{\theta}}$ variant often performing best. By grounding geometry in a learned energy landscape, the approach enables scalable, data-aware geodesics that can improve generative modeling, trajectory planning, and cognitive-neuroscience interpretations of high-dimensional data geometry.
Abstract
What is the shortest path between two data points lying in a high-dimensional space? While the answer is trivial in Euclidean geometry, it becomes significantly more complex when the data lies on a curved manifold, where a Riemannian metric is needed to describe the space's local curvature. Estimating such a metric, however, remains a major challenge in high dimensions. In this work, we propose a method for deriving Riemannian metrics directly from pretrained Energy-Based Models (EBMs), a class of generative models that assign low energy to high-density regions. These metrics define spatially varying distances, enabling the computation of geodesics: shortest paths that follow the data manifold's intrinsic geometry. We introduce two novel metrics derived from EBMs and show that they produce geodesics that remain closer to the data manifold and exhibit lower curvature distortion, as measured by alignment with ground-truth trajectories. We evaluate our approach on increasingly complex datasets: synthetic datasets with known data density, rotated character images with interpretable geometry, and high-resolution natural images embedded in a pretrained VAE latent space. Our results show that EBM-derived metrics consistently outperform established baselines, especially in high-dimensional settings. Our work is the first to derive Riemannian metrics from EBMs, enabling data-aware geodesics and unlocking scalable, geometry-driven learning for generative modeling and simulation.
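
To make the idea of "spatially varying distances" concrete, the following is a minimal sketch of how a conformal metric of the form $\mathbf{G}(\mathbf{x}) = \lambda(\mathbf{x})\,\mathbf{I}$ changes the length of a path. Everything specific here is an assumption for illustration: the toy `energy` function stands in for a pretrained EBM, the conformal factor $\lambda(\mathbf{x}) = \exp(\beta E(\mathbf{x}))$ and the value of `beta` are hypothetical choices rather than the definitions of $\mathbf{G}_{E_{\theta}}$ or $\mathbf{G}_{1/p_{\theta}}$, and the paths are fixed polylines rather than the output of a neural interpolant.

```python
import numpy as np

def energy(x):
    """Toy energy (hypothetical stand-in for a trained EBM):
    low on the unit circle, high everywhere else."""
    return (np.linalg.norm(x, axis=-1) - 1.0) ** 2

def conformal_length(path, beta=10.0):
    """Discrete length of a polyline under G(x) = lam(x) * I,
    with the assumed conformal factor lam(x) = exp(beta * E(x)).
    Each segment contributes sqrt(lam(midpoint)) * Euclidean length."""
    seg = path[1:] - path[:-1]
    mid = 0.5 * (path[1:] + path[:-1])
    lam = np.exp(beta * energy(mid))
    return float(np.sum(np.sqrt(lam) * np.linalg.norm(seg, axis=-1)))

# Two candidate paths between the same endpoints on the circle.
t = np.linspace(0.0, 1.0, 201)[:, None]
a, b = np.array([-1.0, 0.0]), np.array([1.0, 0.0])
straight = (1.0 - t) * a + t * b              # cuts through the high-energy interior
theta = np.pi * (1.0 - t)
arc = np.concatenate([np.cos(theta), np.sin(theta)], axis=1)  # follows the low-energy ring

straight_len = conformal_length(straight)
arc_len = conformal_length(arc)
print(f"straight: {straight_len:.3f}  arc: {arc_len:.3f}")
```

In this toy setting the Euclidean-shortest straight segment crosses a high-energy region and becomes much longer under the metric than the arc that stays on the low-energy ring, which is the behavior a data-aware geodesic is meant to capture.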
