The Information Geometry of UMAP

Alexander Kolpakov, Aidan Rocke

TL;DR

This paper reframes UMAP through the lens of Information Geometry, linking its local probabilistic construction to geometric notions on manifolds and to KL-based divergences. It clarifies how conformal rescaling and a uniformity assumption underpin the embedding from high- to low-dimensional space via a cross-entropy objective (which equals a KL divergence up to a term that does not depend on the embedding), and discusses how probabilistic kNN-graphs and kernel choices shape the learned geometry. It also proposes topological extensions that use Vietoris–Rips complexes to capture multi-scale structure and persistence, potentially enriching the embedding with topological guarantees. The work provides a principled theoretical foundation for UMAP, highlighting its connections to Fisher metrics and suggesting practical avenues for incorporating topology into manifold learning.
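The relation between the cross-entropy and KL objectives mentioned above can be checked numerically. The following is a minimal sketch, not the authors' code: it assumes each graph edge carries a Bernoulli membership probability, with `p` standing for the high-dimensional edge probabilities and `q` for the low-dimensional ones. All function and variable names are hypothetical and the values are random placeholders. It illustrates that the two objectives differ only by the entropy of `p`, which does not depend on the embedding.

```python
import numpy as np


def _clip(x, eps=1e-12):
    # Keep probabilities away from 0 and 1 so the logarithms stay finite.
    return np.clip(x, eps, 1.0 - eps)


def bernoulli_cross_entropy(p, q):
    # Edge-wise binary cross-entropy, summed over all edges.
    p, q = _clip(p), _clip(q)
    return -np.sum(p * np.log(q) + (1.0 - p) * np.log(1.0 - q))


def bernoulli_entropy(p):
    # Entropy of the high-dimensional edge probabilities (independent of q).
    p = _clip(p)
    return -np.sum(p * np.log(p) + (1.0 - p) * np.log(1.0 - p))


def bernoulli_kl(p, q):
    # Edge-wise KL divergence between Bernoulli(p) and Bernoulli(q), summed.
    p, q = _clip(p), _clip(q)
    return np.sum(p * np.log(p / q) + (1.0 - p) * np.log((1.0 - p) / (1.0 - q)))


# Placeholder data: 20 hypothetical edges and two candidate embeddings.
rng = np.random.default_rng(0)
p = rng.uniform(0.05, 0.95, size=20)
q_a = rng.uniform(0.05, 0.95, size=20)
q_b = rng.uniform(0.05, 0.95, size=20)

# The gap between cross-entropy and KL is the same for both embeddings.
gap_a = bernoulli_cross_entropy(p, q_a) - bernoulli_kl(p, q_a)
gap_b = bernoulli_cross_entropy(p, q_b) - bernoulli_kl(p, q_b)
print(np.isclose(gap_a, gap_b), np.isclose(gap_a, bernoulli_entropy(p)))
```

Because the gap is constant in `q`, minimizing either quantity over the low-dimensional probabilities selects the same embedding, under the assumptions stated above.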

Abstract

In this note we highlight some connections of UMAP to the basic principles of Information Geometry. Originally, UMAP was derived from Category Theory observations. However, we posit that it also has a natural geometric interpretation.

Paper Structure (17 sections, 27 equations, 2 figures, 3 tables)

Figures (2)

  • Figure 1: The uniform point distribution that follows from the Veselov–Shabat KdV equation: both the 3D projection (left) and the UMAP embedding (right) resemble 2D surfaces.
  • Figure 2: A non-uniform distribution: the projection still looks like a 2D surface, but the UMAP embedding is essentially 1D.