Metric Space Magnitude for Evaluating the Diversity of Latent Representations
Katharina Limbeck, Rayna Andreeva, Rik Sarkar, Bastian Rieck
TL;DR
This work addresses the challenge of evaluating the intrinsic diversity of latent representations without relying on ground-truth distributions. It applies metric-space magnitude, together with the multi-scale summaries MagArea and MagDiff, to produce robust, scale-aware diversity measures for text, image, and graph embeddings. The authors establish axiomatic advantages, provide efficient computational strategies (notably via Cholesky factorization), and validate the approach in extensive experiments that show superior performance over traditional diversity metrics and reliable detection of mode collapse and mode dropping. The results suggest that magnitude-based diversity offers a principled, scalable tool for model evaluation and comparison in representation learning, with practical implications for debugging and regularization in generative and embedding-based systems.
Abstract
The magnitude of a metric space is a novel invariant that provides a measure of the 'effective size' of a space across multiple scales, while also capturing numerous geometrical properties, such as curvature, density, or entropy. We develop a family of magnitude-based measures of the intrinsic diversity of latent representations, formalising a novel notion of dissimilarity between magnitude functions of finite metric spaces. Our measures are provably stable under perturbations of the data, can be efficiently calculated, and enable a rigorous multi-scale characterisation and comparison of latent representations. We show their utility and superior performance across different domains and tasks, including (i) the automated estimation of diversity, (ii) the detection of mode collapse, and (iii) the evaluation of generative models for text, image, and graph data.
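To make the notion of 'effective size' across scales concrete, the following is a minimal sketch and not the authors' implementation. It relies on the standard definition of magnitude for a finite metric space: at scale t, form the similarity matrix Z with entries exp(-t * d(x_i, x_j)) and sum the entries of its inverse. The sketch assumes Euclidean distances and distinct points, so that Z is positive definite and a Cholesky factorization applies. The function names `magnitude`, `magnitude_function`, and `area_under_curve` are hypothetical, and the area computed here is only a rough stand-in for an area-style multi-scale summary, not the paper's exact definition of MagArea.

```python
import numpy as np


def magnitude(points, t):
    """Magnitude of a finite Euclidean point set at scale t > 0 (illustrative)."""
    X = np.asarray(points, dtype=float)
    # Pairwise Euclidean distances d(x_i, x_j).
    diff = X[:, None, :] - X[None, :, :]
    D = np.sqrt((diff ** 2).sum(axis=-1))
    # Similarity matrix Z_ij = exp(-t * d_ij); positive definite for
    # distinct points in Euclidean space.
    Z = np.exp(-t * D)
    # With Z = L L^T, the sum of entries of Z^{-1} equals ||L^{-1} 1||^2.
    L = np.linalg.cholesky(Z)
    # A dedicated triangular solver would be cheaper; a general solve keeps
    # this sketch dependent on NumPy only.
    x = np.linalg.solve(L, np.ones(len(X)))
    return float(x @ x)


def magnitude_function(points, scales):
    """Evaluate the magnitude at each scale in `scales`."""
    return np.array([magnitude(points, t) for t in scales])


def area_under_curve(scales, values):
    """Trapezoidal area under a sampled curve (assumed summary, see lead-in)."""
    scales = np.asarray(scales, dtype=float)
    values = np.asarray(values, dtype=float)
    return float(np.sum(0.5 * (values[1:] + values[:-1]) * np.diff(scales)))


# Synthetic data: a spread-out sample and a tightly clustered one that
# mimics a collapsed representation.
rng = np.random.default_rng(0)
spread = rng.normal(size=(50, 2))
collapsed = 0.1 * rng.normal(size=(50, 2))
scales = np.linspace(0.5, 5.0, 10)

print(area_under_curve(scales, magnitude_function(spread, scales)))
print(area_under_curve(scales, magnitude_function(collapsed, scales)))
```

On this synthetic data the spread-out sample yields the larger area, which matches the intuition that a collapsed representation has a smaller effective size at every scale considered. The dense pairwise distance matrix limits this sketch to small point sets.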
