Inductive Global and Local Manifold Approximation and Projection
Jungeum Kim, Xiao Wang
TL;DR
GLoMAP unifies global and local manifold learning by constructing a locally adaptive global distance from $\hat{d}_{loc}$ using KNN scales and then merging via shortest-path on a graph, with a tempering schedule that reveals global structure before local details. The inductive variant iGLoMAP adds a mapper $Q_\theta$ to produce embeddings for unseen data, trained with a particle-based scheme that preserves transductive stability while enabling generalization. Theoretical results guarantee the local distance estimator is consistent with the geodesic distance on a manifold, and the global metric space can be formed as an extended metric via a coequalizer-like construction. Empirically, GLoMAP and iGLoMAP achieve competitive performance on simulated and real datasets (e.g., MNIST, Spheres, hierarchical data), demonstrating strong global-to-local structure preservation and scalable inductive embeddings for large-scale DR tasks.
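The distance construction summarized above (local distances rescaled by KNN scales, then merged by shortest paths on a graph) can be illustrated with a minimal sketch. This is not the paper's estimator: the exact form of $\hat{d}_{loc}$ is not given in this summary, so the sketch assumes a simple stand-in in which each KNN edge length is divided by the geometric mean of the two endpoints' local scales, with the local scale taken to be the mean distance to the k nearest neighbors. The function name `global_distances` and the parameter `k` are hypothetical.

```python
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import shortest_path


def global_distances(X, k=5):
    """Illustrative global distance: shortest paths over locally rescaled KNN edges.

    Assumptions (not taken from the paper): the local scale sigma_i is the mean
    distance from point i to its k nearest neighbors, and the local distance is
    ||x_i - x_j|| / sqrt(sigma_i * sigma_j). Assumes no duplicate points.
    """
    n = X.shape[0]
    # Brute-force pairwise Euclidean distances (fine for small n).
    diff = X[:, None, :] - X[None, :, :]
    D = np.sqrt((diff ** 2).sum(axis=-1))

    # Indices of the k nearest neighbors, skipping column 0 (the point itself).
    idx = np.argsort(D, axis=1)[:, 1:k + 1]
    knn_d = np.take_along_axis(D, idx, axis=1)

    # Local scale per point (stand-in for the KNN scale).
    sigma = knn_d.mean(axis=1)

    # Rescaled edge weights on the KNN graph.
    rows = np.repeat(np.arange(n), k)
    cols = idx.ravel()
    weights = knn_d.ravel() / np.sqrt(sigma[rows] * sigma[cols])
    graph = csr_matrix((weights, (rows, cols)), shape=(n, n))

    # Merge local distances into a global one via Dijkstra shortest paths.
    # Pairs in different connected components get distance inf.
    return shortest_path(graph, method="D", directed=False)
```

Pairs of points that are not connected in the KNN graph receive an infinite distance in this sketch, which loosely mirrors the idea of an extended metric; choosing a larger `k` makes the graph more likely to be connected.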
Abstract
Nonlinear dimensional reduction with the manifold assumption, often called manifold learning, has proven useful across a wide range of high-dimensional data analyses. The significant impact of t-SNE and UMAP has catalyzed intense research interest in further innovations that visualize not only the local but also the global structure of the data. Moreover, there have been consistent efforts toward generalizable dimensional reduction that handles unseen data. In this paper, we first propose GLoMAP, a novel manifold learning method for dimensional reduction and high-dimensional data visualization. GLoMAP preserves locally and globally meaningful distance estimates and displays a progression from global to local formation during the course of optimization. Furthermore, we extend GLoMAP to its inductive version, iGLoMAP, which utilizes a deep neural network to map data to its lower-dimensional representation. This allows iGLoMAP to provide lower-dimensional embeddings for unseen points without needing to re-train the algorithm. iGLoMAP is also well-suited for mini-batch learning, enabling large-scale, accelerated gradient calculations. We have successfully applied both GLoMAP and iGLoMAP in simulated and real-data settings, with experiments showing performance competitive with state-of-the-art methods.
