NOMAD Projection
Brandon Duderstadt, Zach Nussbaum, Laurens van der Maaten
TL;DR
This paper introduces NOMAD Projection, a distributed nonlinear dimensionality reduction method for unstructured data visualization that can train across multiple GPUs by approximating an upper bound on the InfoNC-t-SNE loss. By leveraging a cluster-based ANN index for positive forces and a surrogate loss using cluster means for negative forces, NOMAD Projection dramatically improves scalability while preserving local and, to some extent, global structure. The authors provide theoretical bounds linking NOMAD Projection to InfoNC-t-SNE and demonstrate strong empirical performance on ArXiv, ImageNet, PubMed, and a 60-million-point Multilingual Wikipedia map, often outperforming or matching GPU-based baselines with significantly reduced wall-clock time. The work enables large-scale, explainable data visualizations and opens avenues for multi-node extensions and broader applications in contrastive learning and language modeling.
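The cluster-mean surrogate for negative forces mentioned above can be illustrated with a small NumPy sketch. It assumes a Cauchy kernel, as in t-SNE-style methods, and replaces the sum over all points with a size-weighted sum over cluster means. The function names and the exact weighting are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def cauchy_affinity(x, points):
    """Cauchy kernel 1 / (1 + ||x - p||^2) between one point x and each row of points."""
    sq_dist = ((points - x) ** 2).sum(axis=-1)
    return 1.0 / (1.0 + sq_dist)

def exact_repulsion_mass(x, y):
    """Reference quantity: sum of affinities from x to every embedded point in y."""
    return float(cauchy_affinity(x, y).sum())

def cluster_mean_repulsion_mass(x, y, labels):
    """Assumed surrogate: one affinity per cluster mean, weighted by cluster size."""
    cluster_ids, counts = np.unique(labels, return_counts=True)
    means = np.stack([y[labels == c].mean(axis=0) for c in cluster_ids])
    return float((counts * cauchy_affinity(x, means)).sum())

# Toy 2-D embedding with two clusters; labels would come from the ANN index's clustering.
rng = np.random.default_rng(0)
y = np.concatenate([
    rng.normal(loc=[5.0, 0.0], scale=0.1, size=(50, 2)),
    rng.normal(loc=[0.0, 5.0], scale=0.1, size=(80, 2)),
])
labels = np.array([0] * 50 + [1] * 80)
x = np.zeros(2)

exact = exact_repulsion_mass(x, y)
approx = cluster_mean_repulsion_mass(x, y, labels)
print(f"exact={exact:.4f} surrogate={approx:.4f}")
```

The surrogate evaluates one kernel per cluster instead of one per point, which is what would let each device work mostly with its own clusters plus a small table of means. How close the two quantities are depends on how tight the clusters are relative to their distance from the query point.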
Abstract
The rapid adoption of generative AI has driven an explosion in the size of datasets consumed and produced by AI models. Traditional methods for unstructured data visualization, such as t-SNE and UMAP, have not kept up with the pace of dataset scaling. This presents a significant challenge for AI explainability, which relies on these methods for exploratory data analysis. In this paper, we introduce Negative Or Mean Affinity Discrimination (NOMAD) Projection, the first method for unstructured data visualization via nonlinear dimensionality reduction that can run on multiple GPUs at train time. We provide theory that situates NOMAD Projection as an approximate upper bound on the InfoNC-t-SNE loss, and empirical results that demonstrate NOMAD Projection's superior performance and speed profile compared to existing state-of-the-art methods. We demonstrate the scalability of NOMAD Projection by computing the first complete data map of Multilingual Wikipedia.
