IsUMap: Manifold Learning and Data Visualization leveraging Vietoris-Rips filtrations

Lukas Silvester Barth, Fatemeh Fahimi, Parvaneh Joharinad, Jürgen Jost, Janis Keck

TL;DR

IsUMap addresses the challenge of producing informative, geometry-faithful low-dimensional representations for data with non-uniform distributions by merging locally distorted metrics derived from Vietoris-Rips filtrations into a global intrinsic metric via metric realization. It blends Isomap and UMAP within the framework of uber metric spaces, using star-graph local metrics, $t$-conorm merging, and shortest-path completion to generate a unified distance and a low-dimensional embedding via MDS. The work provides both theoretical foundations (weighted simplicial complexes, metric realization, and category-theoretic framing) and extensive empirical demonstrations across synthetic manifolds, image data, non-uniform hemispherical distributions, knotted proteins, and RNA-velocity trajectories, illustrating improved geometric fidelity and topological structure preservation. The approach offers a robust, interpretable pipeline for visualization and downstream tasks in scenarios with density variation and complex local geometry.
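The pipeline summarized above (local star-graph metrics, merging, shortest-path completion, MDS) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function name `isumap_sketch`, the normalization of each star graph by its mean neighbour distance, the use of a plain minimum to merge the two directed distances (which corresponds to the maximum $t$-conorm under any decreasing distance-to-membership map), and the Floyd-Warshall completion are all assumptions made for illustration.

```python
import numpy as np

def isumap_sketch(X, k=5, n_components=2):
    """Toy pipeline: local star metrics -> merge -> shortest paths -> classical MDS.

    Hypothetical helper for illustration only; not the reference IsUMap code.
    """
    n = X.shape[0]
    # Pairwise Euclidean distances in the ambient space.
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    # k nearest neighbours of each point (column 0 of the argsort is the point itself).
    nbrs = np.argsort(D, axis=1)[:, 1:k + 1]

    # Directed local ("star graph") distances: each point rescales the distances
    # to its neighbours by its own mean neighbour distance (assumed normalization).
    local = np.full((n, n), np.inf)
    for i in range(n):
        scale = D[i, nbrs[i]].mean()
        local[i, nbrs[i]] = D[i, nbrs[i]] / scale

    # Merge the two directed values of each pair. The minimum distance plays the
    # role of the maximum t-conorm on memberships (assumed merging rule).
    merged = np.minimum(local, local.T)
    np.fill_diagonal(merged, 0.0)

    # Shortest-path completion (Floyd-Warshall) turns the merged weights
    # into a global metric on the sample.
    G = merged.copy()
    for m in range(n):
        G = np.minimum(G, G[:, m, None] + G[None, m, :])
    if not np.isfinite(G).all():
        raise ValueError("neighbour graph is disconnected; increase k")

    # Classical MDS: double-centre the squared distances and keep the
    # eigenvectors belonging to the largest eigenvalues.
    J = np.eye(n) - np.ones((n, n)) / n
    B = -0.5 * J @ (G ** 2) @ J
    vals, vecs = np.linalg.eigh(B)
    top = np.argsort(vals)[::-1][:n_components]
    Y = vecs[:, top] * np.sqrt(np.maximum(vals[top], 0.0))
    return Y, G
```

Calling `isumap_sketch(X, k=4)` on an `(n, d)` array returns the embedding together with the completed distance matrix, so the intermediate metric can be inspected separately from the MDS step.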

Abstract

This work introduces IsUMap, a novel manifold learning technique that enhances data representation by integrating aspects of UMAP and Isomap with Vietoris-Rips filtrations. We present a systematic and detailed construction of a metric representation for locally distorted metric spaces that captures complex data structures more accurately than previous schemes. Our approach addresses limitations in existing methods by accommodating non-uniform data distributions and intricate local geometries. We validate its performance through extensive experiments on examples of various geometric objects and benchmark real-world datasets, demonstrating significant improvements in representation quality.

Paper Structure (13 sections, 4 theorems, 13 equations, 7 figures, 5 tables)

Key Result

Theorem 1.1

Let $(M,g)$ be a Riemannian manifold that is compact (or more generally, satisfies some technical condition that essentially amounts to a positive lower bound on the injectivity radius), and $VR(M,r)$ be the Vietoris-Rips complex at scale $r$, corresponding to the metric $d$ defined by the Riemannia
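For a finite sample, the Vietoris-Rips complex at scale $r$ named in the theorem can be sketched in a few lines of Python. The helper `vietoris_rips` below is hypothetical, and it assumes the convention that a simplex is included when all pairwise distances are at most $r$ (some texts use a strict inequality or a scale of $2r$); Euclidean distance stands in for the metric $d$.

```python
from itertools import combinations
from math import dist

def vietoris_rips(points, r, max_dim=2):
    """Return the simplices (as index tuples) of a Vietoris-Rips complex at scale r.

    Illustrative helper; the "pairwise distance <= r" convention is an assumption.
    """
    n = len(points)
    # close[i][j] records whether points i and j are within distance r.
    close = [[dist(points[i], points[j]) <= r for j in range(n)] for i in range(n)]
    # Every point is a 0-simplex.
    simplices = [(i,) for i in range(n)]
    # A simplex with `size` vertices has dimension size - 1.
    for size in range(2, max_dim + 2):
        for idx in combinations(range(n), size):
            # Include the simplex when all of its vertices are pairwise close.
            if all(close[a][b] for a, b in combinations(idx, 2)):
                simplices.append(idx)
    return simplices
```

On the four corners of a unit square, scale $r=1$ yields only the four sides as edges, while a scale above $\sqrt{2}$ also adds both diagonals and all four triangles.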

Figures (7)

  • Figure 6.1: MNIST dataset (data size $=10000$, dim $=28\times 28$, $k=20$): (a) IsUMap, (b) Isomap, (c) UMAP. Wisconsin breast cancer dataset (data size $=570$, dim $=32$, $k=20$): (d) IsUMap, (e) Isomap, (f) UMAP.
  • Figure 6.2: Clustering performance on the MNIST dataset after dimensionality reduction by (a) IsUMap, (b) Isomap, and (c) UMAP.
  • Figure 6.3: Visualization in dimension $2$ of a sample of size $10000$ generated on a hemisphere with a non-uniform distribution, with $k=30$: (a) data set, (b) IsUMap, (c) Isomap, (d) UMAP.
  • Figure 6.4: Visualization in dimension $2$ of a sample of size $10000$ generated on a hemisphere with a non-uniform distribution, by IsUMap (first row) and UMAP (second row), with $k=30$ and various $t$-conorms: algebraic sum (a, e), canonical (b, f), bounded sum (c, g), drastic sum (d, h).
  • Figure 6.5: Visualization in dimension $2$ of a sample of size $10000$ generated on a hemisphere with a non-uniform distribution, by IsUMap without subtracting $\rho_i$ (with $k=30$), with various $t$-conorms: algebraic sum (a), canonical (b), bounded sum (c), drastic sum (d).
  • ...and 2 more figures
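The $t$-conorms named in the captions of Figures 6.4 and 6.5 have standard textbook definitions on $[0,1]$, sketched below. The function names are illustrative, and reading "canonical" as the maximum $t$-conorm is an assumption, since the captions do not define it.

```python
def algebraic_sum(a, b):
    # Probabilistic (algebraic) sum: a + b - a*b.
    return a + b - a * b

def maximum(a, b):
    # Maximum t-conorm; assumed here to be what the captions call "canonical".
    return max(a, b)

def bounded_sum(a, b):
    # Bounded (Lukasiewicz) sum: the sum capped at 1.
    return min(1.0, a + b)

def drastic_sum(a, b):
    # Drastic sum: returns the other argument when one is 0, and 1 otherwise.
    if a == 0.0:
        return b
    if b == 0.0:
        return a
    return 1.0
```

For any pair of inputs in $[0,1]$ these satisfy maximum $\le$ algebraic sum $\le$ bounded sum $\le$ drastic sum, so the choice controls how aggressively two local membership values are combined.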

Theorems & Definitions (17)

  • Definition 1.1
  • Definition 1.2
  • Theorem 1.1
  • Definition 1.3
  • Example 1
  • Definition 2.1
  • Remark
  • Definition 2.2
  • Lemma 1
  • Proof
  • ...and 7 more