Diffusion Maps is not Dimensionality Reduction

Julio Candanedo, Alejandro Patiño

Abstract

Diffusion maps (DMAP) are often used as a dimensionality-reduction tool, but more precisely they provide a spectral representation of the intrinsic geometry rather than a complete charting method. To illustrate this distinction, we study a Swiss roll with known isometric coordinates and compare DMAP, Isomap, and UMAP across latent dimensions. For each representation, we fit an oracle affine readout to the ground-truth chart and measure reconstruction error. Isomap most efficiently recovers the low-dimensional chart, UMAP provides an intermediate tradeoff, and DMAP becomes accurate only after combining multiple diffusion modes. Thus the correct chart lies in the span of diffusion coordinates, but standard DMAP do not by themselves identify the appropriate combination.
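
The evaluation protocol described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' code: the spiral parametrization (`r = theta`), the sampling ranges, the sample size, the helper names `sample_swiss_roll` and `affine_readout_error`, and the relative Frobenius error are all choices made here for illustration, since the abstract does not specify them. The raw 3D coordinates stand in for "a representation"; any embedding (DMAP, Isomap, UMAP) could be passed in the same way.

```python
import numpy as np

def sample_swiss_roll(n, theta_min=1.5 * np.pi, theta_max=4.5 * np.pi,
                      height=10.0, seed=0):
    """Sample a Swiss roll together with its isometric 2D chart (assumed setup)."""
    rng = np.random.default_rng(seed)
    # Arc length of the Archimedean spiral r = theta:
    # s(theta) = (theta * sqrt(1 + theta^2) + asinh(theta)) / 2
    grid = np.linspace(theta_min, theta_max, 4096)
    arc = 0.5 * (grid * np.sqrt(1 + grid**2) + np.arcsinh(grid))
    arc -= arc[0]
    # Sampling uniformly in arc length gives uniform density on the sheet.
    s = rng.uniform(0.0, arc[-1], n)
    theta = np.interp(s, arc, grid)      # numerical inverse of s(theta)
    h = rng.uniform(0.0, height, n)
    chart = np.column_stack([s, h])      # ground-truth flat coordinates
    roll = np.column_stack([theta * np.cos(theta), h, theta * np.sin(theta)])
    return chart, roll

def affine_readout_error(Z, chart):
    """Fit chart ~ Z @ W + b by least squares; return the relative residual."""
    A = np.column_stack([Z, np.ones(len(Z))])    # append intercept column
    W, *_ = np.linalg.lstsq(A, chart, rcond=None)
    resid = chart - A @ W
    centered = chart - chart.mean(axis=0)
    return np.linalg.norm(resid) / np.linalg.norm(centered)

chart, roll = sample_swiss_roll(500)
err_raw = affine_readout_error(roll, chart)      # rolled 3D coordinates
err_oracle = affine_readout_error(chart, chart)  # sanity check: the chart itself
print(f"raw 3D readout error: {err_raw:.3f}, oracle: {err_oracle:.2e}")
```

Because the readout is "oracle" (it is fit against the ground truth), the error measures only whether the chart lies in the affine span of the representation, not whether the method identifies it on its own.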

Paper Structure

This paper contains 9 sections, 13 equations, 5 figures.

Figures (5)

  • Figure 1: A randomly sampled 2D sheet (left) and its perfect rolling in 3D (right).
  • Figure 2: Reconstruction error of the ground-truth sheet as a function of latent dimension $d$. Isomap (IMAP) compresses the geometry most efficiently at low dimension, UMAP provides an intermediate tradeoff, and DMAP requires more modes but ultimately yields the most accurate reconstruction.
  • Figure 3: Ground-truth sheet and affine reconstructions from IMAP, DMAP, and UMAP as the latent dimension $d$ increases. IMAP produces the correct sheet at small $d$, DMAP initially collapses to low-dimensional spectral modes before recovering the sheet at large $d$, and UMAP transitions between these behaviors.
  • Figure 4: DMAP readout spectra for the two ground-truth coordinates, showing the coefficient magnitudes $|L_{ni}|$ versus the diffusion spectral variable $1-\lambda_n$. Low values of $1-\lambda_n$ correspond to slow diffusion modes, while values near $1$ correspond to high-frequency modes. The broad low-amplitude background and irregular spikes are consistent with residual least-squares gauge freedom and numerical noise, rather than a unique geometric signal.
  • Figure 5: Two-dimensional charts formed by pairing DMAP mode 0 with modes 1--10. The unrolled sheet is not directly recovered from the first two diffusion modes; instead, sheet-like coordinates appear only for particular mode combinations, consistent with the spectrum in Fig. 4 (mode 5) and in line with the view of DMAP as a spectral basis rather than a canonical chart.
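
The quantities named in the captions for Figures 2 and 4 (readout error versus $d$, and coefficient magnitudes $|L_{ni}|$ versus $1-\lambda_n$) can be approximated with a small dense diffusion-map computation. The sketch below is an assumption-laden illustration rather than the paper's pipeline: the Gaussian kernel, the bandwidth `eps=4.0`, the sample size, the absence of density normalization, and the helper names `diffusion_modes` and `readout` are all hypothetical choices made here.

```python
import numpy as np

# Assumed setup: Swiss roll from the spiral r = theta, sampled uniformly in
# arc length so that (s, h) is an isometric chart of the sheet.
rng = np.random.default_rng(0)
n = 400
grid = np.linspace(1.5 * np.pi, 4.5 * np.pi, 4096)
arc = 0.5 * (grid * np.sqrt(1 + grid**2) + np.arcsinh(grid))
arc -= arc[0]
s = rng.uniform(0.0, arc[-1], n)
theta = np.interp(s, arc, grid)
h = rng.uniform(0.0, 10.0, n)
chart = np.column_stack([s, h])
X = np.column_stack([theta * np.cos(theta), h, theta * np.sin(theta)])

def diffusion_modes(X, eps):
    """Eigenpairs of the Markov matrix P = D^{-1} K, sorted by decreasing eigenvalue."""
    sq = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    K = np.exp(-sq / eps)                    # Gaussian affinity
    deg = K.sum(axis=1)
    S = K / np.sqrt(np.outer(deg, deg))      # symmetric conjugate of P
    lam, V = np.linalg.eigh(S)
    order = np.argsort(lam)[::-1]
    lam, V = lam[order], V[:, order]
    psi = V / np.sqrt(deg)[:, None]          # right eigenvectors of P
    return lam, psi

def readout(psi_d, chart):
    """Affine least-squares readout; returns mode coefficients and relative error."""
    A = np.column_stack([psi_d, np.ones(len(psi_d))])
    W, *_ = np.linalg.lstsq(A, chart, rcond=None)
    resid = chart - A @ W
    err = np.linalg.norm(resid) / np.linalg.norm(chart - chart.mean(axis=0))
    return W[:-1], err                       # drop the intercept row

lam, psi = diffusion_modes(X, eps=4.0)

# Error versus number of modes; mode 0 of P is constant and is skipped.
dims = [2, 5, 10, 20, 50]
errs = [readout(psi[:, 1:d + 1], chart)[1] for d in dims]

# Readout spectrum: coefficient magnitudes against 1 - lambda_n.
L, _ = readout(psi[:, 1:51], chart)
spectral = 1.0 - lam[1:51]
coeff_mag = np.abs(L)                        # shape (50, 2), one column per chart axis

for d, e in zip(dims, errs):
    print(f"d = {d:3d}  relative readout error = {e:.3f}")
```

Two design notes. The affine readout is invariant to rescaling individual modes, so weighting the coordinates by $\lambda_n^t$ would not change the errors, only the coefficient magnitudes. The indexing here skips the constant eigenvector, which may or may not match the "mode 0" convention used in Figure 5.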