UMAP Is Spectral Clustering on the Fuzzy Nearest-Neighbor Graph

Yang Yang

TL;DR

Problem: clarify the theoretical relationship between UMAP and spectral methods. Approach: prove that UMAP's negative-sampling SGD implements contrastive learning on the $k$-NN similarity graph and is equivalent to spectral clustering on that graph, with spectral initialization giving the exact linear solution. Main results: the equivalence is exact for the Gaussian kernel and a first-order approximation for the default Cauchy kernel. Practical impact: provides a unified spectral viewpoint that informs initialization, kernel choice, and graph construction for UMAP and related methods.
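One way to read the "first-order approximation" claim is sketched below. It assumes the Cauchy-type kernel has the form $1/(1 + a\|z\|^{2b})$ with $a = b = 1$ and a Gaussian temperature of $\tau = 0.5$; both values are illustrative choices, not taken from the paper. Under those assumptions the negative log-kernels are $\|z\|^2$ and $\log(1 + \|z\|^2)$, which agree to first order in $\|z\|^2$ and drift apart as distances grow.

```python
import numpy as np

def neg_log_gaussian(r, tau=0.5):
    # -log exp(-r^2 / (2 tau)) = r^2 / (2 tau); tau = 0.5 (assumed) gives r^2
    return r**2 / (2.0 * tau)

def neg_log_cauchy(r, a=1.0, b=1.0):
    # -log(1 / (1 + a r^(2b))) = log(1 + a r^(2b)); a = b = 1 is an assumption
    return np.log1p(a * r ** (2.0 * b))

r = np.array([0.01, 0.1, 0.5, 1.0])  # embedding distances ||z||
gap = neg_log_gaussian(r) - neg_log_cauchy(r)

# The gap is r^2 - log(1 + r^2), which behaves like r^4 / 2 for small r.
for ri, gi in zip(r, gap):
    print(f"r = {ri:4.2f}   gap = {gi:.3e}")
```

The gap is negligible for nearby points and grows with distance, which is consistent with an equivalence that is exact for the Gaussian kernel and only approximate for the Cauchy-type one.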

Abstract

UMAP (Uniform Manifold Approximation and Projection) is among the most widely used algorithms for nonlinear dimensionality reduction and data visualisation. Despite its popularity, and despite being presented through the lens of algebraic topology, the exact relationship between UMAP and classical spectral methods has remained informal. In this work, we prove that UMAP performs spectral clustering on the fuzzy $k$-nearest-neighbour graph. Our proof proceeds in three steps: (1) we show that UMAP's stochastic optimisation with negative sampling is a contrastive learning objective on the similarity graph; (2) we invoke the result of HaoChen et al. [8], establishing that contrastive learning on a similarity graph is equivalent to spectral clustering; and (3) we verify that UMAP's spectral initialisation computes the exact linear solution to this spectral problem. The equivalence is exact for Gaussian kernels, and holds as a first-order approximation for UMAP's default Cauchy-type kernel. Our result unifies UMAP, contrastive learning, and spectral clustering under a single framework, and provides theoretical grounding for several empirical observations about UMAP's behaviour.
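Step (3) can be illustrated with a minimal spectral-embedding sketch. It is not UMAP's implementation: the graph uses binary symmetrised $k$-NN weights instead of fuzzy membership strengths, the Laplacian is the symmetric normalised one, and the function names `knn_affinity` and `spectral_embedding` are hypothetical.

```python
import numpy as np

def knn_affinity(X, k):
    """Binary, symmetrised k-NN affinity matrix (a stand-in for fuzzy weights)."""
    sq = np.sum(X**2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T  # squared distances
    np.fill_diagonal(d2, np.inf)                    # exclude self-neighbours
    idx = np.argsort(d2, axis=1)[:, :k]             # k nearest per row
    n = X.shape[0]
    W = np.zeros((n, n))
    W[np.repeat(np.arange(n), k), idx.ravel()] = 1.0
    return np.maximum(W, W.T)                       # symmetrise

def spectral_embedding(W, dim):
    """Eigenvalues (ascending) and eigenvectors 1..dim of I - D^-1/2 W D^-1/2."""
    d_inv_sqrt = 1.0 / np.sqrt(W.sum(axis=1))
    L = np.eye(len(W)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    vals, vecs = np.linalg.eigh(L)                  # L is symmetric
    # On a connected graph the first eigenvector is trivial, so it is skipped.
    return vals, vecs[:, 1:dim + 1]

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.1, (20, 2)), rng.normal(5.0, 0.1, (20, 2))])
W = knn_affinity(X, k=5)
vals, Y = spectral_embedding(W, dim=2)
print(vals[:3])
```

On this toy data the two blobs are far enough apart that the graph splits into two components, so the two smallest eigenvalues are both numerically zero. That is the usual spectral signature of cluster structure in the graph.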

Paper Structure

This paper contains 29 sections, 3 theorems, 23 equations, 1 table.

Key Result

Theorem 2.3

Let $\pi$ be a similarity graph defined by data augmentation, $Z = f(X)$ the learned representations, and $k$ the embedding kernel. Then the contrastive (InfoNCE) loss is equivalent to: where $R(Z) = \prod_i \sum_{j \neq i}k(Z_i - Z_j)$. For the Gaussian kernel $k(z) = \exp(-\|z\|^2/2\tau)$, this becomes: which is spectral clustering on $\pi$ (Definition 2.2).
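The normalisation term $R(Z)$ in Theorem 2.3 can be evaluated directly from its definition. The sketch below does so for the Gaussian kernel, reading the exponent as $\|z\|^2/(2\tau)$ and working with $\log R(Z)$ to avoid underflow in the product; the function names and the default $\tau = 1$ are assumptions made for illustration.

```python
import numpy as np

def gaussian_kernel_matrix(Z, tau=1.0):
    """K[i, j] = exp(-||Z_i - Z_j||^2 / (2 tau))."""
    sq = np.sum(Z**2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * Z @ Z.T, 0.0)
    return np.exp(-d2 / (2.0 * tau))

def log_R(Z, tau=1.0):
    """log R(Z) = sum_i log sum_{j != i} k(Z_i - Z_j)."""
    K = gaussian_kernel_matrix(Z, tau)
    np.fill_diagonal(K, 0.0)  # drop the j == i terms
    return float(np.sum(np.log(K.sum(axis=1))))

Z_spread = np.array([[0.0, 0.0], [3.0, 0.0], [0.0, 3.0]])
Z_collapsed = np.zeros((3, 2))  # all points embedded at the origin

print(log_R(Z_spread), log_R(Z_collapsed))
```

$R(Z)$ is largest when all embedded points coincide and shrinks as they spread apart, so the term measures how collapsed an embedding is.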

Theorems & Definitions (7)

  • Definition 2.1: Graph Laplacian
  • Definition 2.2: Spectral Clustering (von Luxburg, 2007)
  • Theorem 2.3: HaoChen et al. [8], Theorem 3.1
  • Theorem 3.1: UMAP Is Spectral Clustering
  • Remark 3.2
  • Lemma A.1
  • proof