Shrink the longest: improving latent space isotropy with symplicial geometry
Sergei Kudriashov, Olesya Karpik, Eduard Klyshinsky
TL;DR
The paper addresses representation degeneration in transformer embeddings, where latent spaces become highly anisotropic during fine-tuning. It introduces a differentiable regularization signal based on the persistent entropy of 0D barcodes from Vietoris-Rips filtrations, leveraging simplicial geometry to improve isotropy while preserving existing clustering structures. Empirical results on MRPC and CoLA show reduced anisotropy and occasional improvements in downstream metrics, without full reparameterization or additional inference overhead; the selection of topological features plays a crucial role. Overall, the approach is model-agnostic, suited to fine-tuning scenarios with limited data, and offers a practical pathway to healthier latent-space geometry in contextual embeddings.
Abstract
Although transformer-based models have been dominating the field of deep learning, various studies of their embedding space have shown that they suffer from the "representation degeneration problem": embeddings tend to be distributed in a narrow cone, making the latent space highly anisotropic. Increasing the isotropy has been shown to improve performance in downstream tasks in both static and contextual language models. However, most approaches either add inference overhead or require a substantial amount of data for model reparametrization. We propose a novel regularization technique based on simplicial geometry to improve the isotropy of latent representations. The core idea of our method is to maximize the persistent entropy of barcodes obtained by applying a Vietoris-Rips filtration to contextual embeddings in the underlying latent space. We demonstrate that the method leads to an increase in downstream performance while significantly lowering the anisotropy during fine-tuning by exploiting existing geometric structures instead of reparametrization.
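To make the quantity in the abstract concrete, the following is a minimal sketch of persistent entropy for 0D barcodes, not the authors' implementation. It relies on a standard fact: in a Vietoris-Rips filtration every point is born at scale 0, and the finite 0D bars die at the edge lengths of a minimum spanning tree of the point cloud (up to a constant factor that depends on the filtration convention and does not affect the entropy). The function names `mst_edge_lengths` and `persistent_entropy`, the Euclidean metric, and the pure-Python Prim's algorithm are assumptions chosen for illustration; a training-time regularizer would need a differentiable version in an autodiff framework.

```python
import math


def mst_edge_lengths(points):
    """Return the n-1 edge lengths of a Euclidean minimum spanning tree.

    These lengths equal the finite 0D bar lengths of the Vietoris-Rips
    filtration, up to a convention-dependent constant factor.
    Hypothetical helper for illustration; uses Prim's algorithm, O(n^2).
    """
    n = len(points)
    if n < 2:
        return []
    in_tree = [False] * n
    best = [math.inf] * n  # cheapest known connection of each point to the tree
    best[0] = 0.0
    lengths = []
    for step in range(n):
        # Pick the point outside the tree that is closest to it.
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if step > 0:  # the first point starts the tree and adds no edge
            lengths.append(best[u])
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best[v]:
                    best[v] = d
    return lengths


def persistent_entropy(lengths):
    """Shannon entropy of bar lengths normalized to sum to one."""
    total = sum(lengths)
    if total <= 0.0:
        return 0.0
    return -sum((l / total) * math.log(l / total) for l in lengths if l > 0.0)


# Evenly spaced points give equal bars and therefore maximal entropy.
even = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
# One distant point creates a single long bar, which lowers the entropy.
uneven = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (10.0, 0.0)]

h_even = persistent_entropy(mst_edge_lengths(even))
h_uneven = persistent_entropy(mst_edge_lengths(uneven))
```

In this toy setting the entropy is largest when all bars have equal length, so a maximization objective pushes the bar lengths toward uniformity, for instance by shrinking the longest bar relative to the others.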
