Barnes-Hut-SNE
Laurens van der Maaten
TL;DR
The paper scales t-SNE to large datasets with Barnes-Hut-SNE, which makes two changes: it computes sparse input similarities from nearest neighbors found with a vantage-point tree, and it approximates the embedding gradient with a variant of the Barnes-Hut N-body algorithm. The result runs in $O(N \log N)$ time and $O(N)$ memory, compared with $O(N^2)$ for standard t-SNE, which makes embeddings of millions of points feasible while preserving local structure. Experiments on MNIST, CIFAR-10, NORB, and TIMIT show substantial speedups with minimal loss in embedding quality compared to standard t-SNE. The author discusses two limitations, the lack of error bounds and the restriction to two- or three-dimensional embeddings, and proposes future work on error bounds, higher-dimensional generalizations, and parallelization.
Abstract
The paper presents an $O(N \log N)$ implementation of t-SNE, an embedding technique that is commonly used for the visualization of high-dimensional data in scatter plots and that normally runs in $O(N^2)$. The new implementation uses vantage-point trees to compute sparse pairwise similarities between the input data objects, and it uses a variant of the Barnes-Hut algorithm (an algorithm used by astronomers to perform N-body simulations) to approximate the forces between the corresponding points in the embedding. Our experiments show that the new algorithm, called Barnes-Hut-SNE, leads to substantial computational advantages over standard t-SNE, and that it makes it possible to learn embeddings of data sets with millions of objects.
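
To make the Barnes-Hut idea concrete, the sketch below approximates the t-SNE normalization term $Z = \sum_{k \neq l} (1 + \|y_k - y_l\|^2)^{-1}$ for a 2D embedding using a quadtree. It is a minimal illustration, not the paper's implementation: the names `Cell`, `build_tree`, `row_sum`, `approx_normalizer`, `exact_normalizer`, and `sample_points` are hypothetical, and the summary criterion (cell width divided by distance to the cell's center of mass, compared against `theta`) is the classic Barnes-Hut test, assumed here for simplicity and possibly differing in detail from the one used in the paper. The sketch also assumes the embedding points are distinct.

```python
import math
import random

MIN_HALF = 1e-12  # assumption: stop splitting tiny cells (guards against duplicates)

class Cell:
    """Square quadtree cell that tracks its point count and center of mass."""
    def __init__(self, cx, cy, half):
        self.cx, self.cy, self.half = cx, cy, half  # cell center, half side length
        self.count = 0
        self.mx = self.my = 0.0   # center of mass of the points in this cell
        self.point = None         # the stored point while the cell holds exactly one
        self.children = None

    def insert(self, x, y):
        # Update the running center of mass, then push the point down the tree.
        self.mx = (self.mx * self.count + x) / (self.count + 1)
        self.my = (self.my * self.count + y) / (self.count + 1)
        self.count += 1
        if self.count == 1:
            self.point = (x, y)
            return
        if self.half < MIN_HALF:
            return
        if self.children is None:
            h = self.half / 2
            self.children = [Cell(self.cx + sx * h, self.cy + sy * h, h)
                             for sy in (-1, 1) for sx in (-1, 1)]
            self._child(*self.point).insert(*self.point)
            self.point = None
        self._child(x, y).insert(x, y)

    def _child(self, x, y):
        # Index matches the child order built in insert(): x varies fastest.
        return self.children[(x >= self.cx) + 2 * (y >= self.cy)]

def build_tree(points):
    """Build a quadtree whose root cell covers all points."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    half = max(max(xs) - min(xs), max(ys) - min(ys)) / 2 + 1e-9
    root = Cell((min(xs) + max(xs)) / 2, (min(ys) + max(ys)) / 2, half)
    for x, y in points:
        root.insert(x, y)
    return root

def row_sum(cell, x, y, theta):
    """Approximate sum over points j in `cell` of 1 / (1 + ||(x, y) - y_j||^2)."""
    if cell.count == 0:
        return 0.0
    d2 = (x - cell.mx) ** 2 + (y - cell.my) ** 2
    if cell.children is None:
        # Leaf: skip the query point itself (distance exactly zero).
        return 0.0 if d2 == 0.0 else cell.count / (1.0 + d2)
    if 2 * cell.half < theta * math.sqrt(d2):
        # Cell is small relative to its distance: treat it as one summary point.
        return cell.count / (1.0 + d2)
    return sum(row_sum(c, x, y, theta) for c in cell.children)

def approx_normalizer(points, theta):
    """Barnes-Hut approximation of Z; theta = 0 visits every leaf."""
    root = build_tree(points)
    return sum(row_sum(root, x, y, theta) for x, y in points)

def exact_normalizer(points):
    """Quadratic-time reference value of Z."""
    return sum(1.0 / (1.0 + (xi - xj) ** 2 + (yi - yj) ** 2)
               for i, (xi, yi) in enumerate(points)
               for j, (xj, yj) in enumerate(points) if i != j)

def sample_points(n, seed=0):
    """Hypothetical test data: n points drawn from a 2D standard normal."""
    rng = random.Random(seed)
    return [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n)]
```

Calling `approx_normalizer(sample_points(200), 0.5)` and comparing against `exact_normalizer` on the same points shows the trade-off: a larger `theta` summarizes more cells and does less work, while `theta = 0` falls back to visiting every point. The same traversal pattern, with force vectors accumulated instead of scalars, is what an approximation of the repulsive part of the gradient would look like under these assumptions.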
