The Road to the Closest Point is Paved by Good Neighbors
Sariel Har-Peled, Benjamin Raichel, Eliot W. Robson
TL;DR
The paper introduces two graph-based constructions for $(1+\varepsilon)$-ANN on a set of $n$ points in $\mathbb{R}^d$ (or in a space of bounded doubling dimension): (i) a WSPD-derived navigable NN-graph with $O(n/\varepsilon^d)$ edges and a greedy-walk query time of $O(\varepsilon^{-d-1}\log^2 \Psi)$, and (ii) a greedy-permutation-based NN-graph that also has $O(n/\varepsilon^d)$ edges, on which the greedy walk takes $O(\varepsilon^{-d}\log \Psi)$ iterations and $O(\varepsilon^{-d-1}\log^2 \Psi)$ total time, improved to $O(\varepsilon^{-d}\log \Psi)$ with early stopping. Here $\Psi$ denotes the spread of the point set. In both constructions the graph size is independent of the spread, and queries are answered by greedy routing alone. The results build on well-separated pair decompositions and greedy permutations to obtain linear space for fixed $\varepsilon$ and $d$, with query time polylogarithmic in the spread, in both doubling metrics and Euclidean space.
Abstract
Given a set $P$ of $n$ points in $\mathbb{R}^d$ and a parameter $\varepsilon \in (0,1)$, we present a new construction of a directed graph $G$, of size $O(n/\varepsilon^d)$, such that $(1+\varepsilon)$-ANN queries can be answered by performing a greedy walk on $G$, repeatedly moving to a neighbor that is (significantly) better than the current point. To the best of our knowledge, this is the first construction of linear size with no dependency on the spread of the point set. The resulting query time is $O(\varepsilon^{-d} \log \Psi)$, where $\Psi$ is the spread of $P$. The new construction is surprisingly simple and should be practical.
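To make the greedy walk concrete, the following is a minimal Python sketch of the query procedure only. It is not the paper's graph construction: the toy path graph, the names `greedy_walk`, `points`, and `out_edges`, and the stopping rule (move to the best out-neighbor whenever it is strictly closer to the query, rather than "significantly" better) are all assumptions made for illustration.

```python
import math


def greedy_walk(points, out_edges, query, start=0):
    """Walk greedily toward `query` along directed edges.

    points:    list of coordinate tuples (hypothetical input format).
    out_edges: dict mapping a point index to its out-neighbor indices.
    Returns (index of the final point, number of hops taken).

    Assumption: we move whenever some out-neighbor is strictly closer
    to the query; the paper asks for a "significantly" better neighbor.
    """
    current = start
    current_d = math.dist(points[current], query)
    hops = 0
    while True:
        best, best_d = current, current_d
        # Scan all out-neighbors and remember the closest one.
        for v in out_edges[current]:
            d = math.dist(points[v], query)
            if d < best_d:
                best, best_d = v, d
        if best == current:
            # No neighbor improves on the current point: stop here.
            return current, hops
        current, current_d = best, best_d
        hops += 1


# Toy stand-in graph: four collinear points joined as a path.
points = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
out_edges = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}

result, hops = greedy_walk(points, out_edges, (2.9, 0.1))
print(result, hops)
```

On this path graph the walk happens to reach the exact nearest neighbor, but on an arbitrary graph a greedy walk can stall at a local minimum. The point of the constructions summarized above is to choose the edges so that the walk provably ends at a $(1+\varepsilon)$-approximate nearest neighbor.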
