Attraction-Repulsion Spectrum in Neighbor Embeddings

Jan Niklas Böhm, Philipp Berens, Dmitry Kobak

TL;DR

This paper describes an attraction-repulsion spectrum for neighbor embeddings: the balance between attraction along $k$NN edges and repulsion between all points determines whether an embedding emphasizes continuous or discrete structure. By varying the exaggeration factor $\rho$ in t-SNE and placing UMAP and ForceAtlas2 (FA2) on the resulting spectrum, it shows that stronger attraction better represents continuous manifold structure, while stronger repulsion better represents cluster structure. The authors relate the high-attraction limit to Laplacian Eigenmaps and show that UMAP's negative sampling lowers the effective repulsion, which places UMAP at moderately increased attraction (similar to $\rho\approx 4$ on MNIST); FA2 sits at even stronger attraction (similar to $\rho\approx 30$ on MNIST) due to non-decaying attractive forces. In practice, this helps choose a method based on the structure of the data (trajectories vs. clusters) and shows how optimization tricks and sampling schemes influence embedding geometry.

Abstract

Neighbor embeddings are a family of methods for visualizing complex high-dimensional datasets using $k$NN graphs. To find the low-dimensional embedding, these algorithms combine an attractive force between neighboring pairs of points with a repulsive force between all points. One of the most popular examples of such algorithms is t-SNE. Here we empirically show that changing the balance between the attractive and the repulsive forces in t-SNE using the exaggeration parameter yields a spectrum of embeddings, which is characterized by a simple trade-off: stronger attraction can better represent continuous manifold structures, while stronger repulsion can better represent discrete cluster structures and yields higher $k$NN recall. We find that UMAP embeddings correspond to t-SNE with increased attraction; mathematical analysis shows that this is because the negative sampling optimisation strategy employed by UMAP strongly lowers the effective repulsion. Likewise, ForceAtlas2, commonly used for visualizing developmental single-cell transcriptomic data, yields embeddings corresponding to t-SNE with the attraction increased even more. At the extreme of this spectrum lie Laplacian Eigenmaps. Our results demonstrate that many prominent neighbor embedding algorithms can be placed onto the attraction-repulsion spectrum, and highlight the inherent trade-offs between them.
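
The role of the exaggeration parameter described above can be illustrated with a minimal NumPy sketch. It assumes the standard t-SNE gradient with a Cauchy kernel and treats exaggeration as a factor $\rho$ multiplying only the attractive term; the function name `tsne_forces` and its arguments are hypothetical and not taken from the paper or any library. It uses dense $O(n^2)$ arrays, so it is only suitable for small examples.

```python
import numpy as np

def tsne_forces(Y, P, rho=1.0):
    """Split the t-SNE gradient into attractive and repulsive parts.

    Hypothetical helper for illustration only.
    Y   : (n, d) array of embedding coordinates.
    P   : (n, n) symmetric affinity matrix with zero diagonal, summing to 1.
    rho : exaggeration factor applied to the attractive term only.
    """
    diff = Y[:, None, :] - Y[None, :, :]       # pairwise differences y_i - y_j
    sq_dist = np.sum(diff ** 2, axis=-1)       # squared embedding distances
    W = 1.0 / (1.0 + sq_dist)                  # Cauchy kernel w_ij
    np.fill_diagonal(W, 0.0)                   # no self-interaction
    Q = W / W.sum()                            # normalized similarities q_ij

    # Attraction acts only where p_ij > 0 (neighboring pairs), scaled by rho.
    attractive = 4.0 * rho * np.einsum("ij,ijk->ik", P * W, diff)
    # Repulsion acts between all pairs and does not depend on rho.
    repulsive = -4.0 * np.einsum("ij,ijk->ik", Q * W, diff)
    return attractive, repulsive
```

The full gradient is `attractive + repulsive`, and a gradient descent step subtracts it from `Y`. Raising `rho` scales only the first term, which is the sense in which exaggeration shifts the balance toward attraction.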

Paper Structure

This paper contains 14 sections, 17 equations, 25 figures.

Figures (25)

  • Figure 1: Attraction-repulsion spectrum for the MNIST data. Different embeddings of the MNIST data set of hand-written digits ($n=70\,000$); colors denote digits as shown in the t-SNE panel. Multiplying all attractive forces by an exaggeration factor $\rho$ yields a spectrum of embeddings. Values below 1 yield inflated clusters. Values above 1 yield more compact clusters. Higher values make multiple clusters merge, with $\rho\to\infty$ approximately corresponding to Laplacian Eigenmaps (Linderman and Steinerberger, 2019). Insets show two subsets of digits separated in higher eigenvectors. UMAP is similar to $\rho\approx 4$. ForceAtlas2 is similar to $\rho\approx 30$.
  • Figure 2: The role of affinities in t-SNE. MNIST data set. (a) Default t-SNE, Gaussian affinities, perplexity 30. (b) t-SNE with binary $k$NN affinities: all nonzero $p_{ij}$ are the same, and $p_{ij}>0$ iff point $i$ is among the 15 nearest neighbors of point $j$, or vice versa.
  • Figure 3: UMAP with various simplifications. MNIST data set. (a) Default UMAP with $a\approx1.6$ and $b\approx 0.9$ and LE initialization. (b) UMAP with $a=b=1$ and PCA initialization, the default choice for our experiments. (c) The same as in (b), but using binary $k$NN affinities ($v_{ij} = 1$ iff point $i$ is among the $15$ nearest neighbors of point $j$, or vice versa). (d) The same as in (c), but with $\epsilon=1$.
  • Figure 4: The effect of edge repulsion in ForceAtlas2. MNIST data set. (a) FA2 with repulsion by degree. (b) FA2 without repulsion by degree. Note the difference in scale.
  • Figure 5: Simulated data emulating a developmental trajectory. The points were sampled from 20 isotropic 50-dimensional Gaussians, equally spaced along one axis such that only a few inter-cluster edges exist in the $k$NN graph. Panels (b--f) used a shared random initialization. Panels (b--d) did not use early exaggeration.
  • ...and 20 more figures
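
The binary $k$NN affinities mentioned in the captions of Figures 2 and 3 can be sketched as follows. This is a brute-force NumPy illustration under stated assumptions: the function name `binary_knn_affinities` is hypothetical, distances are Euclidean, and the final normalization so that all entries sum to 1 follows the usual t-SNE convention rather than anything stated in the captions (for UMAP-style $v_{ij}=1$ affinities, that last step would be skipped).

```python
import numpy as np

def binary_knn_affinities(X, k=15):
    """Symmetric binary kNN affinities (hypothetical helper, O(n^2) memory).

    An entry is nonzero iff point i is among the k nearest neighbors of
    point j, or vice versa; all nonzero entries share the same value.
    """
    n = X.shape[0]
    sq_norms = np.sum(X ** 2, axis=1)
    # Squared Euclidean distances between all pairs of points.
    D = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (X @ X.T)
    np.fill_diagonal(D, np.inf)                # a point is not its own neighbor

    nn = np.argsort(D, axis=1)[:, :k]          # indices of the k nearest neighbors
    A = np.zeros((n, n), dtype=bool)
    A[np.repeat(np.arange(n), k), nn.ravel()] = True
    A = A | A.T                                # "or vice versa": symmetrize

    P = A.astype(float)
    return P / P.sum()                         # assumption: normalize to sum to 1
```

Because of the symmetrization, a point can end up with more than `k` nonzero affinities when other points list it as a neighbor without the relation being mutual.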