The Information Geometry of UMAP
Alexander Kolpakov, Aidan Rocke
TL;DR
This paper reframes UMAP through the lens of Information Geometry, linking its local probabilistic construction to geometric notions on manifolds and to KL-based divergences. It clarifies how conformal rescaling and a uniformity assumption underpin the embedding of high-dimensional data into a low-dimensional space via a cross-entropy (equivalently KL) objective, and discusses how probabilistic kNN-graphs and kernel choices shape the learned geometry. It also proposes topological extensions based on Vietoris–Rips complexes to capture multi-scale structure and persistence, which could enrich the embedding with topological guarantees. The work offers a principled theoretical foundation for UMAP, highlights its connections to Fisher metrics, and suggests practical avenues for incorporating topology into manifold learning.
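To make the cross-entropy (equivalently KL) objective mentioned above concrete, the following is a minimal sketch, not the authors' implementation. It assumes the standard UMAP form in which each graph edge carries a high-dimensional membership strength p and a low-dimensional strength q = 1 / (1 + a d^(2b)), and the loss is a sum of edge-wise Bernoulli KL divergences. The function names, the clipping constant eps, and the defaults a = b = 1 are illustrative assumptions.

```python
import numpy as np


def bernoulli_kl(p, q, eps=1e-12):
    """Elementwise KL(Bernoulli(p) || Bernoulli(q)).

    eps is an assumed clipping constant that keeps the logarithms finite.
    """
    p = np.clip(np.asarray(p, dtype=float), eps, 1.0 - eps)
    q = np.clip(np.asarray(q, dtype=float), eps, 1.0 - eps)
    return p * np.log(p / q) + (1.0 - p) * np.log((1.0 - p) / (1.0 - q))


def fuzzy_cross_entropy(p, q):
    """Sum of edge-wise Bernoulli KL divergences between the two graphs."""
    return float(np.sum(bernoulli_kl(p, q)))


def low_dim_membership(y, edges, a=1.0, b=1.0):
    """Low-dimensional edge strengths q_ij = 1 / (1 + a * d_ij^(2b)).

    y     : (n, d) array of embedding coordinates
    edges : (m, 2) integer array of (i, j) index pairs
    a, b  : kernel shape parameters (a = b = 1 is an assumed default)
    """
    y = np.asarray(y, dtype=float)
    edges = np.asarray(edges, dtype=int)
    diff = y[edges[:, 0]] - y[edges[:, 1]]
    d2 = np.sum(diff ** 2, axis=1)  # squared Euclidean distances
    return 1.0 / (1.0 + a * d2 ** b)


# Hypothetical toy data: three points on a line, two edges.
y = np.array([[0.0, 0.0], [1.0, 0.0], [3.0, 0.0]])
edges = np.array([[0, 1], [1, 2]])
p = np.array([0.9, 0.2])  # assumed high-dimensional strengths
q = low_dim_membership(y, edges)
loss = fuzzy_cross_entropy(p, q)
```

The loss vanishes exactly when p and q agree on every edge and is positive otherwise, which is the sense in which minimizing it matches the low-dimensional graph to the high-dimensional one.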
Abstract
In this note we highlight some connections between UMAP and the basic principles of Information Geometry. UMAP was originally derived from observations in Category Theory; we posit that it also has a natural geometric interpretation.
