Mapping the Multiverse of Latent Representations

Jeremy Wayland, Corinna Coupette, Bastian Rieck

TL;DR

Addressing reliability and robustness in latent-space ML, the paper treats representational variability across models, hyperparameters, and datasets as a multiverse. It introduces PRESTO, a topological multiverse framework that uses persistent homology to map embeddings, via four steps: embed data, project embeddings, compute persistence diagrams, and vectorize them into persistence landscapes. It defines $PD$ (Presto Distance) and $PV$ (Presto Variance) to quantify pairwise similarity and variability of latent spaces, with stability guarantees under projection. Through experiments on VAEs and transformers, PRESTO reveals distinct topological structure in latent spaces, enables sensitivity analysis and search-space compression, and supports cross-dataset transfer insights for robust model selection.
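The four steps above (embed, project, compute persistence diagrams, vectorize into landscapes) can be sketched in plain Python. This is a simplified illustration, not the authors' implementation. It assumes 0-dimensional homology only, computed from minimum-spanning-tree edge lengths under the convention that an edge enters the filtration at its length. It also assumes Gaussian random projections, and that landscapes are averaged over projections before taking an approximate $L^2$ landscape distance. All function names (`random_projection`, `h0_persistence`, `toy_presto_distance`, and so on) are hypothetical.

```python
import math
import random

def random_projection(points, k, rng):
    """Project points onto k random Gaussian directions (an assumed projection scheme)."""
    d = len(points[0])
    dirs = [[rng.gauss(0.0, 1.0) / math.sqrt(k) for _ in range(d)] for _ in range(k)]
    return [[sum(p[j] * w[j] for j in range(d)) for w in dirs] for p in points]

def h0_persistence(points):
    """Finite H0 (birth, death) pairs of a Vietoris-Rips filtration.
    All components are born at 0; deaths are the MST edge lengths (Kruskal)."""
    n = len(points)
    edges = sorted((math.dist(points[i], points[j]), i, j)
                   for i in range(n) for j in range(i + 1, n))
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    diagram = []
    for length, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:  # this edge merges two components, so one of them dies
            parent[ri] = rj
            diagram.append((0.0, length))
    return diagram

def landscape(diagram, grid, num_levels=3):
    """Sample the first num_levels persistence-landscape functions on grid."""
    levels = [[0.0] * len(grid) for _ in range(num_levels)]
    for idx, t in enumerate(grid):
        tents = sorted((max(0.0, min(t - b, d - t)) for b, d in diagram), reverse=True)
        for k in range(min(num_levels, len(tents))):
            levels[k][idx] = tents[k]
    return levels

def mean_landscape(diagrams, grid, num_levels):
    """Pointwise average of the landscapes of several diagrams."""
    ls = [landscape(dg, grid, num_levels) for dg in diagrams]
    return [[sum(l[k][i] for l in ls) / len(ls) for i in range(len(grid))]
            for k in range(num_levels)]

def projected_diagrams(emb, k, num_projections, seed):
    """Steps 2 and 3: several random projections, one diagram per projection."""
    rng = random.Random(seed)
    return [h0_persistence(random_projection(emb, k, rng)) for _ in range(num_projections)]

def toy_presto_distance(emb_a, emb_b, k=2, num_projections=4,
                        resolution=50, num_levels=3, seed=0):
    """Approximate L2 distance between the mean landscapes of two embeddings."""
    dgs_a = projected_diagrams(emb_a, k, num_projections, seed)
    dgs_b = projected_diagrams(emb_b, k, num_projections, seed)
    t_max = max(d for dgs in (dgs_a, dgs_b) for dg in dgs for _, d in dg)
    step = t_max / (resolution - 1)
    grid = [i * step for i in range(resolution)]  # shared grid for both landscapes
    la = mean_landscape(dgs_a, grid, num_levels)
    lb = mean_landscape(dgs_b, grid, num_levels)
    sq = sum((x - y) ** 2 for ra, rb in zip(la, lb) for x, y in zip(ra, rb))
    return math.sqrt(sq * step)  # Riemann-sum approximation of the L2 norm

# Step 1 is stood in for by synthetic "embeddings": emb_b is emb_a scaled by 3.
data_rng = random.Random(1)
emb_a = [[data_rng.gauss(0.0, 1.0) for _ in range(5)] for _ in range(30)]
emb_b = [[3.0 * x for x in p] for p in emb_a]
print(toy_presto_distance(emb_a, emb_a), toy_presto_distance(emb_a, emb_b))
```

Using the same seed for both embeddings means that two identical embeddings receive identical projections, so their toy distance is exactly zero. A fuller treatment would use a persistent-homology library to include higher-dimensional features.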

Abstract

Echoing recent calls to counter reliability and robustness concerns in machine learning via multiverse analysis, we present PRESTO, a principled framework for mapping the multiverse of machine-learning models that rely on latent representations. Although such models enjoy widespread adoption, the variability in their embeddings remains poorly understood, resulting in unnecessary complexity and untrustworthy representations. Our framework uses persistent homology to characterize the latent spaces arising from different combinations of diverse machine-learning methods, (hyper)parameter configurations, and datasets, allowing us to measure their pairwise (dis)similarity and statistically reason about their distributions. As we demonstrate both theoretically and empirically, our pipeline preserves desirable properties of collections of latent representations, and it can be leveraged to perform sensitivity analysis, detect anomalous embeddings, or efficiently and effectively navigate hyperparameter search spaces.

Paper Structure

This paper contains 41 sections, 9 theorems, 32 equations, 21 figures, and 12 tables.

Key Result

Theorem 3.8

Given an MMS $\mathfrak{M}$ and an associated PMMS $\mathfrak{M}^k$ with topological loss $\ell^k$, we can bound the pairwise-distance perturbation under projection as $\mathfrak{M}^k[i,j] \leq \mathfrak{M}[i,j] + 2\ell^k$.
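
As a purely illustrative reading of this bound, using hypothetical values that are not taken from the paper: if two latent spaces are at distance $\mathfrak{M}[i,j] = 1.0$ and the topological loss is $\ell^k = 0.05$, then

$$\mathfrak{M}^k[i,j] \leq 1.0 + 2 \cdot 0.05 = 1.1,$$

so projection can increase the measured distance by at most $2\ell^k$.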

Figures (21)

  • Figure 1: Entangled disentanglement. The embedding spaces of DAEs [cha_orthogonality-enforced_2023] vary widely when we change the learning rate (LR) and the batch-normalization hyperparameter $\alpha$ of the model, and the latent structure of the XYC dataset (shapes varying in 2D coordinates and color) is properly disentangled only with the right parameter choices. Presto can topologically assess the (hyper)parameter sensitivity of latent-space models.
  • Figure 2: The Presto pipeline. For each model $M_i$ in our multiverse $\mathcal{M}$, Presto computes the persistent homology associated with the embedding of a dataset $X_i$ generated by $M_i$, yielding a set of embeddings $\mathcal{E}$. This lets us compare the latent spaces of different models via the landscape distance of their persistence landscapes, so that with Presto we can cluster, compress, detect outliers in, and analyze the sensitivity of (hyper)parameter configurations.
  • Figure 3: Comparing Presto distances with other measures. We show the Pearson correlations between Presto, RTD, $k$CKA, and $l$CKA on random data (left), VAE embeddings (center), and LLM embeddings (right). Presto captures representational variation differently from existing methods.
  • Figure 4: Comparing Presto, latent-space geometry, and model performance. We show the distribution of correlations between Presto distances and geometric latent-space distances in the VAE multiverse, estimating geometric distances based on the Pearson distance between Euclidean metric spaces of aligned random samples of size $512$ over $256$ random draws (left), as well as the relationship between landscape norms and model performance for $\beta$-VAE (right). Presto captures geometric similarity between latent spaces and is orthogonal to performance.
  • Figure 5: Landscape-norm distributions in our VAE hyperparameter multiverse. We show the distribution of landscape norms after initialization (left) and training (center), as well as the distribution of Presto distances between the landscape at initialization and the landscape after training (right). Thick lines indicate means, thin black lines indicate interquartile range, and black dots indicate outliers. Training differentially affects landscape norms across models and datasets.
  • ...and 16 more figures

Theorems & Definitions (25)

  • Definition 3.1: Latent-Space Multiverse
  • Definition 3.2: Presto Distance [PD]
  • Definition 3.3: Presto Variance [PV]
  • Definition 3.4: Presto Sensitivity [PS]
  • Definition 3.5: Multiverse Metric Space $\mathfrak{M}$ [MMS]
  • Definition 3.6: Projected Multiverse Metric Space $\mathfrak{M}^k$ [PMMS]
  • Definition 3.7: Topological Loss
  • Theorem 3.8: Metric-Space Preservation under Projection
  • Theorem 3.9: Variance under Projection
  • Definition 1.1: Bottleneck Distance $d_B$
  • ...and 15 more