Barnes-Hut-SNE

Laurens van der Maaten

TL;DR

The paper scales t-SNE to large datasets by introducing Barnes-Hut-SNE, which sparsifies the input similarities with a vantage-point tree and approximates the embedding gradient with a Barnes-Hut N-body approach. It runs in $O(N \log N)$ time and $O(N)$ memory, enabling embeddings of millions of points while preserving local structure. Experiments on MNIST, CIFAR-10, NORB, and TIMIT show substantial speedups with minimal loss in embedding quality compared to standard t-SNE. The authors discuss limitations such as the lack of error bounds and the restriction to two- or three-dimensional embeddings, and propose future work on error bounds, higher-dimensional generalizations, and parallelization.

Abstract

The paper presents an O(N log N)-implementation of t-SNE -- an embedding technique that is commonly used for the visualization of high-dimensional data in scatter plots and that normally runs in O(N^2). The new implementation uses vantage-point trees to compute sparse pairwise similarities between the input data objects, and it uses a variant of the Barnes-Hut algorithm - an algorithm used by astronomers to perform N-body simulations - to approximate the forces between the corresponding points in the embedding. Our experiments show that the new algorithm, called Barnes-Hut-SNE, leads to substantial computational advantages over standard t-SNE, and that it makes it possible to learn embeddings of data sets with millions of objects.
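
The abstract states only that vantage-point trees are used to obtain sparse pairwise similarities. As a rough illustration of the idea, the following Python sketch builds a small vantage-point tree and runs a k-nearest-neighbor query on it. The function names (`build_vp_tree`, `k_nearest`), the choice of the first point as vantage point, and the value of `k` are assumptions made for this example and are not taken from the paper's implementation.

```python
import heapq
import math
import random
import statistics


def build_vp_tree(points):
    """Build a VP-tree as nested tuples (vantage_point, mu, inside, outside)."""
    if not points:
        return None
    vp, rest = points[0], points[1:]  # assumption: first point is the vantage point
    if not rest:
        return (vp, 0.0, None, None)
    dists = [math.dist(vp, q) for q in rest]
    mu = statistics.median(dists)  # split radius around the vantage point
    inside = [q for q, d in zip(rest, dists) if d <= mu]
    outside = [q for q, d in zip(rest, dists) if d > mu]
    return (vp, mu, build_vp_tree(inside), build_vp_tree(outside))


def _search(node, target, k, heap):
    """Depth-first search keeping the k closest points in a max-heap."""
    if node is None:
        return
    vp, mu, inside, outside = node
    d = math.dist(vp, target)
    if len(heap) < k:
        heapq.heappush(heap, (-d, vp))
    elif d < -heap[0][0]:
        heapq.heapreplace(heap, (-d, vp))
    # Visit the side that contains the target first.
    near, far = (inside, outside) if d <= mu else (outside, inside)
    _search(near, target, k, heap)
    tau = -heap[0][0] if len(heap) == k else math.inf
    # The far side can only hold a closer point if the current search ball
    # of radius tau reaches the boundary at distance mu from the vantage point.
    if abs(d - mu) <= tau:
        _search(far, target, k, heap)


def k_nearest(tree, target, k):
    """Return the k nearest (distance, point) pairs, closest first."""
    heap = []
    _search(tree, target, k, heap)
    return sorted((-neg_d, p) for neg_d, p in heap)


random.seed(0)
data = [tuple(random.gauss(0, 1) for _ in range(5)) for _ in range(200)]
tree = build_vp_tree(data)
# The query point is itself in the tree, so it comes back first at distance 0.
for dist, _ in k_nearest(tree, data[17], 4):
    print(round(dist, 3))
```

Restricting each point's similarities to its nearest neighbors is what makes the input similarity matrix sparse; in a t-SNE setting one would typically drop the query point itself from its own neighbor list.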

Paper Structure

This paper contains 9 sections, 9 equations, 7 figures.

Figures (7)

  • Figure 1: Quadtree constructed on a two-dimensional t-SNE embedding of $500$ MNIST digits (the colors of the points correspond to the digit classes). Note how the quadtree adapts to the local point density in the embedding.
  • Figure 2: Computation time (in seconds) required to embed $70,000$ MNIST digits using Barnes-Hut-SNE (left) and the $1$-nearest neighbor errors of the corresponding embeddings (right) as a function of the trade-off parameter $\theta$.
  • Figure 3: Computation time (in seconds) required to embed MNIST digits (left) and the $1$-nearest neighbor errors of the corresponding embeddings (right) as a function of data set size $N$ for both standard t-SNE and Barnes-Hut-SNE. Note that the required computation time, which is shown on the $y$-axis of the left figure, is plotted on a logarithmic scale.
  • Figure 4: Barnes-Hut-SNE visualizations of four data sets: MNIST handwritten digits (top-left), CIFAR-10 tiny images (top-right), NORB object images (bottom-left), and TIMIT speech frames (bottom-right). The colors of the points indicate the classes of the corresponding objects. The titles of the figures indicate the computation time that was used to construct the corresponding embeddings. Figure best viewed in color.
  • Figure 5: Barnes-Hut-SNE visualization of all $70,000$ MNIST handwritten digit images (constructed in 10 minutes and 45 seconds). Zoom in on the visualization for more detailed views.
  • ...and 2 more figures
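
The captions of Figures 1 and 2 refer to a quadtree over the embedding and to a trade-off parameter $\theta$. The Python sketch below illustrates how the two might interact. It assumes the standard Barnes-Hut opening criterion (a cell is summarized by its center of mass when its width divided by the distance to that center is below $\theta$) and uses the Student-t kernel sum $\sum_{j \neq i} (1 + \|y_i - y_j\|^2)^{-1}$ as the quantity being approximated. `QuadNode`, `repulsion_sum`, and `exact_sum` are hypothetical names for this example, not the paper's code, and the sketch assumes all points are distinct.

```python
import random


class QuadNode:
    """Square cell that tracks its point count and center of mass."""

    def __init__(self, cx, cy, half):
        self.cx, self.cy, self.half = cx, cy, half
        self.count = 0
        self.com_x = 0.0
        self.com_y = 0.0
        self.point = None     # set only while the cell is a one-point leaf
        self.children = None

    def insert(self, p):
        # Update the running center of mass, then route the point downward.
        self.com_x += (p[0] - self.com_x) / (self.count + 1)
        self.com_y += (p[1] - self.com_y) / (self.count + 1)
        self.count += 1
        if self.count == 1:
            self.point = p
            return
        if self.children is None:
            # Leaf already holds a point: split and push that point down.
            h = self.half / 2
            self.children = [QuadNode(self.cx + dx * h, self.cy + dy * h, h)
                             for dy in (-1, 1) for dx in (-1, 1)]
            old, self.point = self.point, None
            self._child(old).insert(old)
        self._child(p).insert(p)

    def _child(self, p):
        return self.children[(p[0] >= self.cx) + 2 * (p[1] >= self.cy)]


def repulsion_sum(node, p, theta):
    """Approximate the sum over q != p of 1 / (1 + ||p - q||^2)."""
    if node.count == 0 or node.point is p:
        return 0.0
    dx, dy = p[0] - node.com_x, p[1] - node.com_y
    d2 = dx * dx + dy * dy
    width = 2 * node.half
    # Summarize the cell if it is a leaf or if width / distance < theta.
    if node.children is None or width * width < theta * theta * d2:
        return node.count / (1.0 + d2)
    return sum(repulsion_sum(c, p, theta) for c in node.children)


def exact_sum(points, p):
    """Brute-force reference for the same sum."""
    return sum(1.0 / (1.0 + (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2)
               for q in points if q is not p)


random.seed(0)
pts = [(random.uniform(-10, 10), random.uniform(-10, 10)) for _ in range(300)]
root = QuadNode(0.0, 0.0, 10.0)
for p in pts:
    root.insert(p)

exact = sum(exact_sum(pts, p) for p in pts)
for theta in (0.0, 0.25, 0.5):
    approx = sum(repulsion_sum(root, p, theta) for p in pts)
    print(theta, abs(approx - exact) / exact)
```

With $\theta = 0$ no cell is ever summarized, so the traversal visits every leaf and reproduces the exact sum; larger values summarize more cells, trading accuracy for speed, which is the trade-off plotted in Figure 2.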