Geodesic Calculus on Latent Spaces

Florine Hartwig, Josua Sassen, Juliane Braunsmann, Martin Rumpf, Benedikt Wirth

TL;DR

The paper addresses how to perform meaningful geometric operations on the latent spaces of autoencoders, where the latent manifold is typically known only implicitly and lacks an explicit manifold structure. It describes latent representations as implicit submanifolds $\mathcal{Z}=\{z: \zeta(z)=0\}$ and learns a projection $\Pi_\sigma$ via a denoising objective, which yields a robust implicit representation and makes Riemannian calculus possible. A time-discrete geodesic calculus is developed and implemented with an augmented Lagrangian approach to compute discrete geodesics and discrete exponential maps on $\mathcal{Z}$ under various metrics, enabling geodesic interpolation and extrapolation directly in latent space. The framework is validated across several data modalities, including discrete shells, motion capture with spherical VAEs, and image data, showing improved interpolation behavior and plausible latent-geodesic paths when decoded. The work thus enables practical, geometry-aware manipulation of latent representations, with potential to extend to distance-based and probabilistic latent models and to richer geometric constructions such as parallel transport and curvature.
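
For orientation, a denoising objective of the kind mentioned above is commonly written as follows. This is a generic sketch and not necessarily the paper's exact formulation: the Gaussian noise model, the expectation over latent codes $z$ sampled from the encoded data, and the residual form of $\zeta_\sigma$ are all assumptions made here for illustration.

$$
\Pi_\sigma \in \operatorname*{arg\,min}_{\Pi}\; \mathbb{E}_{z,\;\varepsilon\sim\mathcal{N}(0,I)} \big\|\Pi(z+\sigma\varepsilon)-z\big\|^2,
\qquad
\zeta_\sigma(z) := z-\Pi_\sigma(z).
$$

Under these assumptions, $\zeta_\sigma$ is approximately zero on $\mathcal{Z}$, because a well-trained $\Pi_\sigma$ leaves points on the latent manifold nearly unchanged and pulls nearby noisy points back toward it.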

Abstract

Latent manifolds of autoencoders provide low-dimensional representations of data, which can be studied from a geometric perspective. We propose to describe these latent manifolds as implicit submanifolds of some ambient latent space. Based on this, we develop tools for a discrete Riemannian calculus approximating classical geometric operators. These tools are robust against inaccuracies of the implicit representation often occurring in practical examples. To obtain a suitable implicit representation, we propose to learn an approximate projection onto the latent manifold by minimizing a denoising objective. This approach is independent of the underlying autoencoder and supports the use of different Riemannian geometries on the latent manifolds. The framework in particular enables the computation of geodesic paths connecting given end points and shooting geodesics via the Riemannian exponential maps on latent manifolds. We evaluate our approach on various autoencoders trained on synthetic and real data.
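
The computation of a geodesic path between given end points can be illustrated on a toy problem. The sketch below is not the authors' implementation: it assumes the analytic implicit function $\zeta(z)=|z|^2-1$ of the unit sphere as a stand-in for a learned $\zeta_\sigma$, the Euclidean metric of the ambient space, a discrete path energy of the form $K\sum_k |z_k-z_{k-1}|^2$, and plain gradient descent as the inner solver of an augmented Lagrangian loop. All function names, parameters, and default values are hypothetical.

```python
import math

def zeta(z):
    """Stand-in implicit function (assumption): unit sphere, zeta(z) = |z|^2 - 1."""
    return sum(c * c for c in z) - 1.0

def grad_zeta(z):
    """Gradient of the stand-in zeta."""
    return [2.0 * c for c in z]

def path_length(path):
    """Sum of Euclidean segment lengths of a discrete path."""
    return sum(math.dist(p, q) for p, q in zip(path, path[1:]))

def discrete_geodesic(z_start, z_end, K=8, mu=10.0, outer=30, inner=400):
    """Minimize K * sum |z_k - z_{k-1}|^2 subject to zeta(z_k) = 0
    for the interior points, using an augmented Lagrangian."""
    # Initialize with linear interpolation, which leaves the manifold.
    path = [[(1 - k / K) * a + (k / K) * b for a, b in zip(z_start, z_end)]
            for k in range(K + 1)]
    lam = [0.0] * (K + 1)              # one multiplier per path point
    step = 1.0 / (8.0 * K + 8.0 * mu)  # conservative, hand-picked for this toy problem
    for _ in range(outer):
        # Inner loop: gradient descent on the augmented Lagrangian in z.
        for _ in range(inner):
            new_path = [p[:] for p in path]
            for k in range(1, K):      # end points stay fixed
                c = zeta(path[k])
                g = grad_zeta(path[k])
                for i in range(len(path[k])):
                    d_energy = 2.0 * K * (2.0 * path[k][i]
                                          - path[k - 1][i] - path[k + 1][i])
                    d_constr = (lam[k] + mu * c) * g[i]
                    new_path[k][i] = path[k][i] - step * (d_energy + d_constr)
            path = new_path
        # Outer loop: first-order multiplier update.
        for k in range(1, K):
            lam[k] += mu * zeta(path[k])
    return path
```

Calling `discrete_geodesic([1.0, 0.0, 0.0], [0.0, 1.0, 0.0])` returns `K + 1` points that lie on the sphere up to a small residual, and `path_length` of the result is close to the quarter great-circle length $\pi/2$, whereas the straight chord has length $\sqrt{2}$. With a learned implicit representation, `zeta` and `grad_zeta` would be replaced by evaluations of the trained network and its derivative.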

Paper Structure

This paper contains 42 sections, 11 equations, 13 figures, 1 algorithm.

Figures (13)

  • Figure 1: Training an autoencoder $(\phi,\psi)$ with data ${\mathcal{X}}$ lying on a manifold ${\mathcal{M}}$ yields a low-dimensional latent manifold ${\mathcal{Z}}$. Using a denoising objective, we learn an implicit representation $\zeta$ of this manifold (color-coding on the 2D slice from blue to yellow indicates $|\zeta|$) based on a projection $\Pi$ onto ${\mathcal{Z}}$ (white arrows, rescaled). Furthermore, we introduce a practical geodesic calculus on this representation, enabling, e.g., shape interpolation using latent manifolds (green dots and shapes).
  • Figure 2: Discrete geodesics for different values of $K$ computed on a torus with learned implicit manifold representation $\zeta_\sigma$ (green points) and highly resolved geodesic computed with ground truth representation $\zeta$ (black line).
  • Figure 3: Comparison between computed exponentials with learned implicit manifold representation $\zeta_\sigma$ (green points) and ground truth representation $\zeta$ (black line). As in any dynamical system, slight numerical inaccuracies lead to an exponentially growing divergence (which is known to be more pronounced in regions of negative curvature as in the right-most example).
  • Figure 4: Interpolations on a learned submanifold of the shape space of discrete shells. Comparison between linear interpolation in latent space (red) and geodesic interpolation using a learned implicit representation (green) of the latent manifold ${\mathcal{Z}}$.
  • Figure 5: Left: Visualization of sample points in latent space (projected from $\mathbb{R}^{10}$ into $\mathbb{R}^3$ based on a PCA) and linear interpolation (red), geodesic interpolation with $\mathcal{W}_{\text{E}}$ (yellow), and geodesic interpolation with $\mathcal{W}_{{\mathcal{M}}}$ (green). Right: Corresponding decoded sequences.
  • ...and 8 more figures