The Road to the Closest Point is Paved by Good Neighbors

Sariel Har-Peled, Benjamin Raichel, Eliot W. Robson

TL;DR

The paper introduces two graph-based constructions for $(1+\varepsilon)$-ANN on a set of $n$ points in $\mathbb{R}^d$ (or in a space of bounded doubling dimension): (i) a WSPD-derived navigable NN-graph with $O(n/\varepsilon^d)$ edges and a greedy-walk query time of $O(\varepsilon^{-d-1}\log^2 \Psi)$, and (ii) a novel greedy-permutation-based NN-graph that also has $O(n/\varepsilon^d)$ edges, whose greedy walk takes $O(\varepsilon^{-d}\log \Psi)$ iterations and $O(\varepsilon^{-d-1}\log^2 \Psi)$ total time, improved to $O(\varepsilon^{-d}\log \Psi)$ with early stopping. Here $\Psi$ is the spread of the point set. In both constructions the graph size is independent of the spread, and queries are answered by greedy routing. The results rely on well-separated pair decompositions and greedy-order constructions, giving space linear in $n$ for fixed $\varepsilon$ and $d$, and apply to both Euclidean spaces and doubling metrics.

Abstract

Given a set $P$ of $n$ points in $\mathbb{R}^d$, and a parameter $\varepsilon \in (0,1)$, we present a new construction of a directed graph $G$, of size $O(n/\varepsilon^d)$, such that $(1+\varepsilon)$-ANN queries can be answered by performing a greedy walk on $G$, repeatedly moving to a neighbor that is (significantly) better than the current point. To the best of our knowledge, this is the first construction of linear size with no dependency on the spread of the point set. The resulting query time is $O(\varepsilon^{-d} \log \Psi)$, where $\Psi$ is the spread of $P$. The new construction is surprisingly simple and should be practical.
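
The greedy walk described in the abstract can be sketched in a few lines. This is only an illustration under stated assumptions, not the paper's algorithm: the function name `greedy_walk`, the adjacency-dictionary representation, and the `shrink` parameter (a stand-in for "significantly better", whose exact threshold is not given here) are all hypothetical, and the toy graph in the usage note is a simple path, not the construction from the paper.

```python
import math


def greedy_walk(points, graph, query, start=0, shrink=1.0):
    """Walk from `start`, moving to the closest out-neighbor while it improves.

    points : list of coordinate tuples
    graph  : dict mapping a vertex index to an iterable of out-neighbor indices
    shrink : hypothetical threshold; a move is taken only if the new distance
             is below shrink * current distance (1.0 = any strict improvement)
    Returns (index, distance) of the vertex where the walk stops.
    """
    cur = start
    cur_dist = math.dist(points[cur], query)
    while True:
        best, best_dist = cur, cur_dist
        # Scan all out-neighbors and remember the one closest to the query.
        for nbr in graph.get(cur, ()):
            d = math.dist(points[nbr], query)
            if d < best_dist:
                best, best_dist = nbr, d
        # Stop when no neighbor is (sufficiently) better than the current vertex.
        if best == cur or best_dist >= shrink * cur_dist:
            return cur, cur_dist
        # The distance strictly decreases on every move, so the walk terminates.
        cur, cur_dist = best, best_dist
```

For example, on four collinear points `[(0, 0), (1, 0), (2, 0), (3, 0)]` joined as a bidirectional path, a walk started at index 0 with query `(2.9, 0)` steps along the path and stops at index 3. Whether the stopping vertex is a $(1+\varepsilon)$-approximate nearest neighbor depends entirely on how the graph was built, which is the subject of the paper.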

Paper Structure

This paper contains 20 sections, 8 theorems, 25 equations, 3 figures, 1 table.

Key Result

Theorem 2.12

For $\varepsilon \in (0,1)$, and a set $\mathsf{P}$ of $n$ points in $\mathbb{R}^d$, one can construct, in $O \bigl( n \log n + {n}/{ \varepsilon^{d}} \bigr)$ time, an $\tfrac{1}{\varepsilon}$-WSPD of $\mathsf{P}$ of size $O(n/{ \varepsilon^{d}})$.
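
To make the separation condition in the theorem concrete, here is a small brute-force check. It assumes the standard textbook definition, which may differ in detail from the paper's own: two point sets $A$ and $B$ are $\tfrac{1}{\varepsilon}$-separated if $\max(\mathrm{diam}(A), \mathrm{diam}(B)) \le \varepsilon \cdot d(A, B)$, where $d(A, B)$ is the smallest distance between a point of $A$ and a point of $B$. The helper names are hypothetical, and the quadratic-time computation is for illustration only; it is not how a WSPD is built within the stated time bound.

```python
import math
from itertools import combinations, product


def diameter(pts):
    """Largest pairwise distance in pts (0.0 for a single point)."""
    return max((math.dist(p, q) for p, q in combinations(pts, 2)), default=0.0)


def set_distance(a, b):
    """Smallest distance between a point of a and a point of b."""
    return min(math.dist(p, q) for p, q in product(a, b))


def is_well_separated(a, b, eps):
    """True if a and b are (1/eps)-separated under the assumed definition."""
    return max(diameter(a), diameter(b)) <= eps * set_distance(a, b)
```

With `a = [(0, 0), (1, 0)]` and `b = [(10, 0), (11, 0)]`, both diameters are 1 and the set distance is 9, so the pair passes the check for `eps = 0.5` and fails it for `eps = 0.1`.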

Figures (3)

  • Figure 2.1: Left: The points selected by robust prune, with $\alpha=4$, where the original set of $\approx 200,000$ points is uniformly distributed in the square, except for a disallowed "island" in the middle. Right: The Apollonius disks that were used during this process. (We have not shown the original point set, as it simply forms a solid blob, and that seemed pointless [or is it pointfull?].)
  • Figure 3.1:
  • Figure 4.1: Illustration of proof.

Theorems & Definitions (24)

  • Definition 2.1
  • Definition 2.2
  • Definition 2.3
  • Definition 2.4
  • Definition 2.5
  • Definition 2.6
  • Definition 2.8
  • Definition 2.9
  • Definition 2.10
  • Definition 2.11
  • ...and 14 more