
Decoder ensembling for learned latent geometries

Stas Syrota, Pablo Moreno-Muñoz, Søren Hauberg

TL;DR

This paper tackles the problem of topological mismatch between the data manifold and Euclidean latent spaces in learned latent geometries. It introduces decoder ensembles to capture model uncertainty and defines geodesics on the expected Riemannian metric $\mathbf{G} = \mathbb{E}_{q(\theta)}[\mathbf{J}_{f_{\theta}}^{\intercal} \mathbf{J}_{f_{\theta}}]$, enabling topology-aware latent interpolation. By discretizing geodesic energy and decorrelating ensemble predictions, the method yields stable geodesics that respect data support, outperforming RBF-based uncertainty models in retraining experiments on MNIST and FMNIST with varying latent dimensions. The approach is practical, scalable, and directly improves the reliability of latent geometries for downstream tasks such as interpolation and representation learning, with code provided for reproduction.
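
As a rough illustration of the expected metric $\mathbf{G} = \mathbb{E}_{q(\theta)}[\mathbf{J}_{f_{\theta}}^{\intercal} \mathbf{J}_{f_{\theta}}]$, the sketch below averages $\mathbf{J}^{\intercal}\mathbf{J}$ over a small ensemble. Everything concrete here is an assumption made for illustration and is not taken from the paper: the toy decoder `tanh(W @ z)`, the helper names `decoder_jacobian` and `expected_metric`, and the ensemble size. A trained ensemble would normally obtain its Jacobians by automatic differentiation.

```python
import numpy as np


def decoder_jacobian(W, z):
    """Jacobian of the hypothetical toy decoder f(z) = tanh(W @ z) w.r.t. z."""
    h = np.tanh(W @ z)
    # d/dz tanh(W z) = diag(1 - tanh^2(W z)) @ W
    return (1.0 - h**2)[:, None] * W


def expected_metric(weights, z):
    """Average J^T J over ensemble members (uniform weights assumed)."""
    jacobians = [decoder_jacobian(W, z) for W in weights]
    return np.mean([J.T @ J for J in jacobians], axis=0)


# Toy ensemble: 8 random decoders mapping a 2-d latent space to 5-d data space.
rng = np.random.default_rng(0)
latent_dim, data_dim, n_members = 2, 5, 8
weights = [rng.normal(size=(data_dim, latent_dim)) for _ in range(n_members)]

z = np.array([0.3, -0.1])
G = expected_metric(weights, z)  # (2, 2) symmetric positive semi-definite matrix
```

Because each term $\mathbf{J}^{\intercal}\mathbf{J}$ is symmetric positive semi-definite, so is their average, which is what lets `G` act as a metric on the latent space.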

Abstract

Latent space geometry provides a rigorous and empirically valuable framework for interacting with the latent variables of deep generative models. This approach reinterprets Euclidean latent spaces as Riemannian through a pull-back metric, allowing for a standard differential geometric analysis of the latent space. Unfortunately, data manifolds are generally compact and easily disconnected or filled with holes, suggesting a topological mismatch to the Euclidean latent space. The most established solution to this mismatch is to let uncertainty be a proxy for topology, but in neural network models, this is often realized through crude heuristics that lack principle and generally do not scale to high-dimensional representations. We propose using ensembles of decoders to capture model uncertainty and show how to easily compute geodesics on the associated expected manifold. Empirically, we find this simple and reliable, thereby coming one step closer to easy-to-use latent geometries.
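
One way to picture geodesic computation on the expected manifold is through a discretized curve energy averaged over ensemble members. The sketch below is a minimal illustration under stated assumptions, not the authors' implementation: the linear toy decoders, the function name `expected_curve_energy`, and the uniform time discretization are all hypothetical choices, and the paper's own discretization may differ in detail.

```python
import numpy as np


def expected_curve_energy(decoders, curve):
    """Discretized curve energy averaged over an ensemble of decoders.

    curve: array of shape (N + 1, d) holding latent points along the path.
    Approximates 0.5 * integral ||d/dt f(c(t))||^2 dt with step 1 / N.
    """
    n_segments = len(curve) - 1
    energies = []
    for f in decoders:
        decoded = np.stack([f(z) for z in curve])
        diffs = np.diff(decoded, axis=0)  # finite differences in data space
        energies.append(0.5 * n_segments * np.sum(diffs**2))
    return float(np.mean(energies))


# Hypothetical ensemble of linear toy decoders f_k(z) = W_k @ z.
rng = np.random.default_rng(0)
weights = [rng.normal(size=(5, 2)) for _ in range(8)]
decoders = [lambda z, W=W: W @ z for W in weights]

# Compare a straight latent path with one whose midpoint is pushed aside.
start, end = np.array([0.0, 0.0]), np.array([1.0, 1.0])
t = np.linspace(0.0, 1.0, 11)[:, None]
straight = (1 - t) * start + t * end
detour = straight.copy()
detour[5] += np.array([0.5, -0.5])

energy_straight = expected_curve_energy(decoders, straight)
energy_detour = expected_curve_energy(decoders, detour)
```

For these linear toy decoders the straight path has the lower energy. With a trained nonlinear ensemble, the interior points would instead be optimized to minimize this quantity, so that paths are steered away from regions where the ensemble members disagree.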

Paper Structure

This paper contains 10 sections, 21 equations, 5 figures, and 1 table.

Figures (5)

  • Figure 1: Shortest paths (geodesics) under the expected metric of a decoder following a Gaussian process. The topological hint of uncertainty is, thus, propagated to the metric. Figure courtesy of Hauberg (2018).
  • Figure 2: Using an ensemble of decoders ensures that regions of the latent space with limited data support have high uncertainty.
  • Figure 3: Upper row: Three examples of the latent space for ensembles of VAE decoders on a reduced version of MNIST data with three classes. Blue curves indicate the geodesic interpolants between two random latent coordinates. Lower row: Three examples of the latent space for a VAE with RBF-generated uncertainties on MNIST data with three classes.
  • Figure 4: The correction term of posterior covariances in a GP tends to zero as $\Delta\mathbf{z}\gg 0$, even in areas of $\mathcal{Z}$ where there is a high density of training data.
  • Figure 5: Histogram of coefficients of variation for MNIST and FMNIST data with $d=2$ in the latent space $\mathcal{Z}$.