Graph-Based Nearest-Neighbor Search without the Spread
Jeff Giliberti, Sariel Har-Peled, Jonas Sauer, Ali Vakilian
TL;DR
The work develops a spread-free framework for graph-based approximate nearest-neighbor (ANN) search in spaces with bounded doubling dimension. It reduces the problem to spread-bounded instances by combining a coarse ANN structure, a low-quality HST, and a $(1+\varepsilon)$-ANN structure, achieving near-linear space and $O(\log n)$ query time. It also introduces a linear-space universal NN graph built from a greedy permutation, complemented by a reverse-tree mechanism and active-resolution slices that direct the search to the relevant regions. A key novelty is overlaying scale-specific graphs into a single structure while using the reverse tree to bypass unhelpful regions, which yields ANN search with provable guarantees. The paper also offers a bootstrap-based improvement that reduces query time further, and provides correctness and runtime analyses for both the multiresolution and linear-space constructions. Together, these results advance graph-based ANN by removing the dependence on the spread for point sets of bounded doubling dimension.
Abstract
Recent work showed how to construct nearest-neighbor graphs of linear size, on a given set $P$ of $n$ points in $\mathbb{R}^d$, such that one can answer approximate nearest-neighbor queries in logarithmic time in the spread. Unfortunately, the spread might be unbounded in $n$, and an interesting theoretical question is how to remove the dependency on the spread. Here, we show how to construct an external linear-size data structure that, combined with the linear-size graph, allows us to answer ANN queries in logarithmic time in $n$.
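
To see why a bound that is logarithmic in the spread can be weak, it may help to compute the spread directly. The sketch below is illustrative only and is not taken from the paper: it assumes the usual definition of the spread as the ratio of the largest to the smallest pairwise distance, and the helper name `spread` and both point sets are hypothetical.

```python
import itertools
import math


def spread(points):
    """Ratio of the largest to the smallest pairwise distance.

    Assumes the standard definition of spread; brute force over all
    pairs, so it is only meant for small illustrative inputs.
    """
    dists = [math.dist(p, q) for p, q in itertools.combinations(points, 2)]
    return max(dists) / min(dists)


n = 8
# Evenly spaced points on a line: the spread grows linearly with n.
uniform = [(float(i),) for i in range(n)]
# Exponentially spaced points on a line: the spread grows exponentially with n.
exponential = [(2.0 ** i,) for i in range(n)]

for name, pts in [("uniform", uniform), ("exponential", exponential)]:
    s = spread(pts)
    print(f"{name}: spread = {s:.0f}, log2(spread) = {math.log2(s):.2f}")
```

For the exponentially spaced set, the logarithm of the spread is already proportional to $n$, and spacing the points more aggressively makes it larger still, so a query bound stated in terms of the spread gives no guarantee in terms of $n$ alone.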
