Relating tSNE and UMAP to Classical Dimensionality Reduction
Andrew Draganov, Simon Dohn
TL;DR
This work investigates the explainability gap of gradient-based DR methods (tSNE/UMAP) by connecting them to classical techniques. It formalizes an attraction/repulsion (ARDR) framework and shows that PCA, MDS, and Isomap can be recovered within this paradigm by applying attractions and repulsions to a randomly initialized dataset, with the PCA gradient expressed as $\nabla^{PCA} = -4 C(G_X - G_Y) C Y$. The authors further demonstrate that UMAP outputs can be reproduced by a classical DR method, double-kernel LLE (DK-LLE, i.e. LLE with two kernels), and report strong empirical and theoretical links between UMAP and DK-LLE, including shared neighborhood-preservation behavior. They propose that UMAP embeddings implicitly preserve local neighborhoods under the input/output kernels, and they conjecture that UMAP achieves a constant-factor approximation to the DK-LLE objective, which offers a pathway to interpretable explanations of UMAP outputs. The work highlights practical implications for interpreting embeddings and suggests future directions for relating modern ARDR methods to classical, explainable DR techniques, potentially enabling rigorous guarantees about the high-dimensional structures they summarize.
Abstract
It has become standard to use gradient-based dimensionality reduction (DR) methods like tSNE and UMAP when explaining what AI models have learned. This makes sense: these methods are fast, robust, and have an uncanny ability to find semantic patterns in high-dimensional data without supervision. Despite this, gradient-based DR methods lack the most important quality that an explainability method should possess: themselves being explainable. That is, given a UMAP output, it is currently unclear what one can say about the corresponding input. We work towards answering this question by relating UMAP to classical DR techniques. Specifically, we show that one can fully recover methods like PCA, MDS, and Isomap in the modern DR paradigm, that is, by applying attractions and repulsions to a randomly initialized dataset. We also show that, with a small change, Locally Linear Embeddings (LLE) can indistinguishably reproduce UMAP outputs. This implies that the UMAP effective objective is minimized by this modified version of LLE (and vice versa). Given this, we discuss what must be true of UMAP embeddings and present avenues for future work.
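As a rough illustration of recovering PCA by gradient updates on a randomly initialized embedding, the sketch below runs plain gradient descent with the gradient $-4 C(G_X - G_Y) C Y$. It rests on assumptions that this page does not spell out: $C$ is read as the centering matrix $I - \frac{1}{n}\mathbf{1}\mathbf{1}^\top$, $G_X = XX^\top$ and $G_Y = YY^\top$ are read as Gram matrices, and the objective is taken to be $\|C(G_X - G_Y)C\|_F^2$, whose gradient in $Y$ is exactly that expression. The function names, toy data, and step-size rule are hypothetical choices made for this example and are not the authors' implementation.

```python
import numpy as np

def centering_matrix(n):
    """C = I - (1/n) 11^T, which subtracts the mean over points."""
    return np.eye(n) - np.full((n, n), 1.0 / n)

def strain(X, Y):
    """Squared Frobenius gap between the centered Gram matrices of X and Y."""
    C = centering_matrix(X.shape[0])
    return float(np.sum((C @ (X @ X.T - Y @ Y.T) @ C) ** 2))

def pca_gradient(X, Y):
    """Gradient of `strain` with respect to Y: -4 C (G_X - G_Y) C Y."""
    C = centering_matrix(X.shape[0])
    return -4.0 * C @ (X @ X.T - Y @ Y.T) @ C @ Y

def embed_by_gradient_descent(X, dim=2, steps=2000, seed=0):
    """Plain gradient descent on `strain` from a small random initialization."""
    n = X.shape[0]
    C = centering_matrix(n)
    # Step size tied to the largest eigenvalue of the centered Gram matrix
    # (a conservative heuristic assumed for this sketch, not from the source).
    lr = 1.0 / (16.0 * np.linalg.eigvalsh(C @ X @ X.T @ C)[-1])
    rng = np.random.default_rng(seed)
    Y = rng.normal(scale=1e-2, size=(n, dim))
    for _ in range(steps):
        Y -= lr * pca_gradient(X, Y)
    return Y

# Toy data with a clear spectral gap after the second principal direction.
rng = np.random.default_rng(1)
X = rng.normal(size=(50, 5)) * np.array([5.0, 3.0, 1.0, 0.5, 0.2])
Y = embed_by_gradient_descent(X)

# Best achievable strain for a 2-D embedding: the squared eigenvalues
# of the centered Gram matrix that a 2-D PCA projection leaves out.
C = centering_matrix(X.shape[0])
eigvals = np.linalg.eigvalsh(C @ X @ X.T @ C)
print("strain after descent:", strain(X, Y))
print("strain of 2-D PCA:   ", float(np.sum(eigvals[:-2] ** 2)))
```

Under these assumptions the two printed values should be close to each other. The embedding `Y` itself can only be expected to match the PCA projection up to a rotation or reflection and a translation, since the objective depends on `Y` only through its centered Gram matrix.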
