Decoder ensembling for learned latent geometries
Stas Syrota, Pablo Moreno-Muñoz, Søren Hauberg
TL;DR
This paper tackles the topological mismatch between data manifolds and the Euclidean latent spaces used in learned latent geometries. It uses an ensemble of decoders to capture model uncertainty and defines geodesics under the expected Riemannian metric $\mathbf{G} = \mathbb{E}_{q(\theta)}[\mathbf{J}_{f_{\theta}}^{\intercal} \mathbf{J}_{f_{\theta}}]$, with $\mathbf{J}_{f_{\theta}}$ the Jacobian of the decoder $f_{\theta}$, enabling topology-aware latent interpolation. By discretizing the geodesic energy and decorrelating ensemble predictions, the method yields stable geodesics that respect the data support, and it outperforms RBF-based uncertainty models in retraining experiments on MNIST and Fashion-MNIST (FMNIST) with varying latent dimensions. The approach is practical and scalable, and it makes latent geometries more reliable for downstream tasks such as interpolation and representation learning. Code is provided for reproduction.
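To make the expected metric concrete, the following is a minimal JAX sketch that averages the pull-back metrics $\mathbf{J}_{f_{\theta}}^{\intercal} \mathbf{J}_{f_{\theta}}$ of several decoders at a single latent point. The tiny two-layer decoders, their random weights, and the names `init_decoder`, `decode`, `pullback_metric`, and `expected_metric` are illustrative assumptions, not the authors' code; the expectation over $q(\theta)$ is approximated here by a plain average over ensemble members.

```python
import jax
import jax.numpy as jnp


def init_decoder(key, latent_dim=2, hidden=16, out_dim=5):
    # Hypothetical toy decoder weights; a real decoder would be trained.
    k1, k2 = jax.random.split(key)
    return {
        "W1": jax.random.normal(k1, (hidden, latent_dim)) / jnp.sqrt(latent_dim),
        "W2": jax.random.normal(k2, (out_dim, hidden)) / jnp.sqrt(hidden),
    }


def decode(params, z):
    # f_theta: latent point z -> data space.
    return params["W2"] @ jnp.tanh(params["W1"] @ z)


def pullback_metric(params, z):
    # J has shape (out_dim, latent_dim); J^T J is the pull-back metric at z.
    J = jax.jacfwd(decode, argnums=1)(params, z)
    return J.T @ J


def expected_metric(ensemble, z):
    # Average the per-member metrics: a Monte Carlo estimate of E_q[J^T J].
    metrics = jax.vmap(pullback_metric, in_axes=(0, None))(ensemble, z)
    return metrics.mean(axis=0)


# Stack four independently initialised decoders along a leading axis.
keys = jax.random.split(jax.random.PRNGKey(0), 4)
ensemble = jax.vmap(init_decoder)(keys)

z = jnp.array([0.3, -0.7])
G = expected_metric(ensemble, z)  # (latent_dim, latent_dim), symmetric PSD
```

Because each $\mathbf{J}^{\intercal}\mathbf{J}$ is symmetric positive semi-definite, so is their average, which means `G` can be used as a metric tensor at `z`.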
Abstract
Latent space geometry provides a rigorous and empirically valuable framework for interacting with the latent variables of deep generative models. This approach reinterprets Euclidean latent spaces as Riemannian manifolds through a pull-back metric, allowing for a standard differential geometric analysis of the latent space. Unfortunately, data manifolds are generally compact and are often disconnected or filled with holes, suggesting a topological mismatch with the Euclidean latent space. The most established solution to this mismatch is to let uncertainty be a proxy for topology, but in neural network models, this is often realized through crude, unprincipled heuristics that generally do not scale to high-dimensional representations. We propose using ensembles of decoders to capture model uncertainty and show how to easily compute geodesics on the associated expected manifold. Empirically, we find this approach simple and reliable, bringing us one step closer to easy-to-use latent geometries.
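As a rough illustration of computing a geodesic under an ensemble, the sketch below discretizes a latent curve into points, measures its energy as the ensemble average of squared distances between consecutive decoded points, and shortens the curve by gradient descent on the interior points. This is a generic finite-difference scheme under stated assumptions: the toy decoders, step size, and iteration count are hypothetical, and the paper's own discretization and its decorrelation of ensemble predictions are not reproduced here.

```python
import jax
import jax.numpy as jnp


def decode(params, z):
    # Hypothetical toy decoder; params = (W1, W2).
    W1, W2 = params
    return W2 @ jnp.tanh(W1 @ z)


def ensemble_energy(interior, z_start, z_end, ensemble):
    # Discrete curve with fixed end points and free interior points.
    curve = jnp.concatenate([z_start[None], interior, z_end[None]], axis=0)

    def member_energy(params):
        x = jax.vmap(lambda z: decode(params, z))(curve)
        # Sum of squared segment lengths in data space (up to a constant factor).
        return jnp.sum((x[1:] - x[:-1]) ** 2)

    # Averaging over members approximates the energy under the expected metric.
    return jnp.mean(jax.vmap(member_energy)(ensemble))


# Toy ensemble of M decoders, stacked along the leading axis.
M, hidden, latent_dim, out_dim = 4, 16, 2, 5
k1, k2 = jax.random.split(jax.random.PRNGKey(0))
ensemble = (
    jax.random.normal(k1, (M, hidden, latent_dim)) / jnp.sqrt(latent_dim),
    jax.random.normal(k2, (M, out_dim, hidden)) / jnp.sqrt(hidden),
)

z_start = jnp.array([-1.0, 0.0])
z_end = jnp.array([1.0, 0.5])

# Initialise the interior points on the straight line between the end points.
t = jnp.linspace(0.0, 1.0, 10)[1:-1, None]
interior = (1.0 - t) * z_start + t * z_end

energy_and_grad = jax.jit(jax.value_and_grad(ensemble_energy))
e0, _ = energy_and_grad(interior, z_start, z_end, ensemble)

lr = 1e-3  # assumed step size, chosen small for this toy problem
for _ in range(200):
    _, grad = energy_and_grad(interior, z_start, z_end, ensemble)
    interior = interior - lr * grad

e_final = ensemble_energy(interior, z_start, z_end, ensemble)
```

Comparing `e_final` with `e0` shows how much the optimized curve improves on straight-line interpolation in this toy setting. In practice an adaptive optimizer and a smoother curve parameterization would likely be preferable to plain gradient descent on raw points.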
