A Spectral Framework for Multi-Scale Nonlinear Dimensionality Reduction

Zeyang Huang, Angelos Chatzimparmpas, Thomas Höllt, Takanori Fujiwara

Abstract

Dimensionality reduction (DR) is characterized by two longstanding trade-offs. First, there is a global-local preservation tension: methods such as t-SNE and UMAP prioritize local neighborhood preservation, yet may distort global manifold structure, while methods such as Laplacian Eigenmaps preserve global geometry but often yield limited local separation. Second, there is a gap between expressiveness and analytical transparency: many nonlinear DR methods produce embeddings without an explicit connection to the underlying high-dimensional structure, limiting insight into the embedding process. In this paper, we introduce a spectral framework for nonlinear DR that addresses these challenges. Our approach embeds high-dimensional data using a spectral basis combined with cross-entropy optimization, enabling multi-scale representations that bridge global and local structure. Leveraging linear spectral decomposition, the framework further supports analysis of embeddings through a graph-frequency perspective, enabling examination of how spectral modes influence the resulting embedding. We complement this analysis with glyph-based scatterplot augmentations for visual exploration. Quantitative evaluations and case studies demonstrate that our framework improves manifold continuity while enabling deeper analysis of embedding structure through spectral mode contributions.
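To make the graph-frequency perspective concrete, the following is a minimal sketch, not the authors' implementation. It builds a neighborhood graph, takes the eigenvectors of its normalized Laplacian as spectral modes ordered by graph frequency, and projects a set of embedding coordinates onto the $S$ lowest-frequency modes. Several things here are assumptions made for illustration: the function names (`knn_graph`, `laplacian_modes`, `low_pass`), the binary symmetric k-NN graph standing in for a fuzzy graph, and the choice of the symmetric normalized Laplacian. The cross-entropy optimization step described in the abstract is omitted entirely.

```python
import numpy as np

def knn_graph(X, k=15):
    """Symmetric binary k-nearest-neighbor adjacency (dense; small N only).

    Assumption: a plain binary k-NN graph, used here as a stand-in for
    a fuzzy neighborhood graph.
    """
    n = X.shape[0]
    sq = np.sum(X**2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)  # squared distances
    np.fill_diagonal(d2, np.inf)                      # exclude self-neighbors
    idx = np.argsort(d2, axis=1)[:, :k]               # k nearest per row
    W = np.zeros((n, n))
    W[np.repeat(np.arange(n), k), idx.ravel()] = 1.0
    return np.maximum(W, W.T)                         # symmetrize

def laplacian_modes(W):
    """Eigenpairs of the symmetric normalized Laplacian, ascending frequency."""
    deg = W.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    L = np.eye(len(W)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    evals, evecs = np.linalg.eigh(L)                  # eigenvalues sorted ascending
    return evals, evecs

def low_pass(Y, evecs, S):
    """Project coordinates Y (N x d) onto the S lowest-frequency modes."""
    U = evecs[:, :S]
    return U @ (U.T @ Y)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 10))        # hypothetical high-dimensional data
    Y = rng.normal(size=(200, 2))         # hypothetical 2-D embedding coordinates
    evals, evecs = laplacian_modes(knn_graph(X, k=15))
    for S in (5, 50, 200):
        residual = np.linalg.norm(Y - low_pass(Y, evecs, S))
        print(S, residual)
```

Because the modes form an orthonormal basis, the residual cannot increase as $S$ grows and vanishes (up to rounding) once all modes are used. This is the sense in which low-frequency modes capture coarse structure and higher-frequency modes add detail.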

Paper Structure

This paper contains 23 sections, 8 equations, 7 figures, 3 tables.

Figures (7)

  • Figure 1: Comparison of data structure, UMAP [mcinnes_umap_2020], and our spectral decomposition snapshots on three datasets. As the number of spectral modes used, $S$, increases, low-frequency snapshots recover coarse organization first and then add finer detail toward the full-spectrum embedding.
  • Figure 2: Petal glyph.
  • Figure 3: Reconstruction error across datasets as the subspace size $S$ increases. Each curve shows the error (y-axis) of the embedding at a given subspace size (x-axis) relative to the full-spectrum embedding. Dashed lines and markers indicate 80%, 90%, and 95% of the final embedding quality. Results are shown for 7 datasets with $N=5000$ samples each to allow comparison between datasets. We use a 15-NN fuzzy graph, 10 progressive stages that divide the spectrum equally, and 500 optimization epochs.
  • Figure 4: Qualitative comparison of embeddings across three representative datasets, as in Table \ref{tab:quantitative_eval_main}. Progressive stages ($S_{1}$--$S_{10}$) are compared with non-progressive full-spectrum results $S_{\mathrm{full}}$, as well as LE, UMAP, and PHATE; colors indicate branch index, embryo time, or class labels.
  • Figure 5: Embeddings for C. elegans (preprocessed). (a) Early stages ($S=5$ and $S=8$) already separate major amphid sensory classes from compact progenitor and neuroblast states in the center. (b) A full-spectrum, last-stage embedding ($S_{20}$ with $S = N-1$) colored by embryo development time. Lineage-related states are organized along the branches.
  • ...and 2 more figures
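
The progressive schedule mentioned in the captions above (stages that divide the spectrum equally, ending at $S = N-1$) can be sketched as follows. This is an illustrative reading of the captions rather than the paper's code: the helper names `stage_sizes` and `relative_error` are hypothetical, the integer rounding rule is an assumption, and the relative Frobenius-norm error is only one plausible way to compare a partial-spectrum embedding against the full-spectrum one.

```python
import numpy as np

def stage_sizes(n_samples, n_stages=10):
    """Cumulative number of spectral modes per stage.

    Assumption: the full spectrum has N - 1 modes and is split into
    equal parts, rounding down with integer arithmetic.
    """
    full = n_samples - 1
    return [(k * full) // n_stages for k in range(1, n_stages + 1)]

def relative_error(Y_partial, Y_full):
    """Relative Frobenius-norm difference between two embeddings.

    Assumption: a hypothetical error measure, not necessarily the
    one used in the paper's reconstruction-error figure.
    """
    return np.linalg.norm(Y_partial - Y_full) / np.linalg.norm(Y_full)

if __name__ == "__main__":
    # N = 5000 samples and 10 stages, matching the Figure 3 caption.
    print(stage_sizes(5000, 10))
```

Under these assumptions the last stage always uses all $N-1$ modes, so its error relative to the full-spectrum embedding is zero by construction.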