Results of the Big ANN: NeurIPS'23 competition
Harsha Vardhan Simhadri, Martin Aumüller, Amir Ingber, Matthijs Douze, George Williams, Magdalen Dobson Manohar, Dmitry Baranchuk, Edo Liberty, Frank Liu, Ben Landrum, Mazin Karjikar, Laxman Dhulipala, Meng Chen, Yue Chen, Rui Ma, Kai Zhang, Yuzheng Cai, Jiayang Shi, Yizhuo Chen, Weiguo Zheng, Zihao Wan, Jie Yin, Ben Huang
TL;DR
The paper surveys the Big ANN Challenge held at NeurIPS 2023, which pushes practical approximate nearest neighbor (ANN) indexing and search across four workloads (filtered, out-of-distribution, sparse, and streaming) under constrained hardware. It details the tracks, datasets, evaluation framework, and submission protocol, and reports that top solutions significantly outperform industry baselines through graph-based indexing, quantization, and hybrid vector-metadata strategies. The competition demonstrates notable academic-industrial collaboration, reveals strengths and limitations of current approaches, and provides insights into future directions for robust, real-world ANN systems. An open-source framework and ongoing leaderboard are highlighted to sustain progress and reproducibility in the vector-search community.
Abstract
The 2023 Big ANN Challenge, held at NeurIPS 2023, focused on advancing the state of the art in indexing data structures and search algorithms for practical variants of Approximate Nearest Neighbor (ANN) search that reflect the growing complexity and diversity of workloads. Unlike prior challenges that emphasized scaling up classical ANN search~\cite{DBLP:conf/nips/SimhadriWADBBCH21}, this competition addressed four variants of ANN search: filtered search, out-of-distribution data, sparse search, and streaming search. Participants developed and submitted solutions that were evaluated on new standard datasets with constrained computational resources. The results showed significant improvements in search accuracy and efficiency over industry-standard baselines, with notable contributions from both academic and industrial teams. This paper summarizes the competition tracks, datasets, evaluation metrics, and the approaches of the top-performing submissions, providing insights into current advances and future directions in approximate nearest neighbor search.
