Graph-Based Nearest-Neighbor Search without the Spread

Jeff Giliberti, Sariel Har-Peled, Jonas Sauer, Ali Vakilian

TL;DR

The work develops a spread-free framework for graph-based approximate nearest-neighbor (ANN) search in spaces with bounded doubling dimension. It first reduces the problem to spread-bounded instances by combining a coarse ANN structure, a low-quality HST, and a $(1+\varepsilon)$-ANN structure, which yields near-linear space and $O(\log n)$ query time. It then introduces a linear-space universal NN graph built from a greedy permutation, complemented by a reverse-tree mechanism and active-resolution slices that restrict the search to the relevant regions. A key novelty is overlaying the scale-specific graphs into a single structure while using the reverse tree to bypass unhelpful regions. The paper also gives a bootstrap-based improvement that further reduces the query time, and provides correctness and running-time analyses for both the multiresolution and the linear-space constructions. Together, these results remove the dependence on the spread from graph-based ANN search with provable guarantees.
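To make the "greedy permutation" mentioned above concrete, here is a minimal sketch of the standard farthest-first ordering: each new point is the one farthest from all points chosen so far. This is a naive quadratic-time illustration under stated assumptions, not the paper's construction; the function name `greedy_permutation`, the `start` parameter, and the sample points are hypothetical.

```python
import math

def greedy_permutation(points, start=0):
    """Return (order, radii) for a farthest-first traversal of `points`.

    order[k] is the index of the k-th chosen point; radii[k] is its distance
    to the previously chosen points (infinity for the first point).
    Naive O(n^2) illustration; names and interface are assumptions.
    """
    n = len(points)
    if n == 0:
        return [], []
    order = [start]
    radii = [math.inf]
    chosen = [False] * n
    chosen[start] = True
    # dist[i] = distance from points[i] to the nearest point chosen so far
    dist = [math.dist(points[i], points[start]) for i in range(n)]
    for _ in range(n - 1):
        # Pick the unchosen point farthest from the current prefix.
        nxt = max((i for i in range(n) if not chosen[i]), key=lambda i: dist[i])
        order.append(nxt)
        radii.append(dist[nxt])
        chosen[nxt] = True
        # Update nearest-chosen distances with the newly added point.
        for i in range(n):
            d = math.dist(points[i], points[nxt])
            if d < dist[i]:
                dist[i] = d
    return order, radii

pts = [(0.0, 0.0), (1.0, 0.0), (10.0, 0.0), (10.0, 1.0)]
order, radii = greedy_permutation(pts)
```

The radii are non-increasing, so every prefix of the ordering acts as a net of the point set at the scale of the next radius. That is what makes such an ordering a natural backbone for scale-by-scale graph constructions.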

Abstract

Recent work showed how to construct nearest-neighbor graphs of linear size, on a given set $P$ of $n$ points in $\mathbb{R}^d$, such that one can answer approximate nearest-neighbor queries in logarithmic time in the spread. Unfortunately, the spread might be unbounded in $n$, and an interesting theoretical question is how to remove the dependency on the spread. Here, we show how to construct an external linear-size data structure that, combined with the linear-size graph, allows us to answer ANN queries in logarithmic time in $n$.
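The following sketch illustrates why a bound that is logarithmic in the spread need not be logarithmic in $n$. It assumes the usual definition of the spread as the ratio between the largest and the smallest pairwise distance; the helper names `spread` and `exponential_points` are hypothetical and serve only this illustration.

```python
import math
from itertools import combinations

def spread(points):
    """Ratio of the largest to the smallest pairwise distance (assumed definition)."""
    dists = [math.dist(p, q) for p, q in combinations(points, 2)]
    return max(dists) / min(dists)

def exponential_points(n):
    """n points on a line at coordinates 1, 2, 4, ..., 2^(n-1)."""
    return [(float(2 ** i),) for i in range(n)]

for n in (4, 8, 16):
    s = spread(exponential_points(n))
    # The spread is 2^(n-1) - 1, so log2(spread) grows linearly in n.
    print(n, s, math.log2(s))
```

For this input the logarithm of the spread is about $n - 1$, so a query bound of the form $O(\log \text{spread})$ degrades to linear in $n$, which is the dependency the paper sets out to remove.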

Paper Structure (35 sections, 21 theorems, 44 equations, 2 figures, 1 table)

Key Result

Theorem 2.12

h-gaa-11. Given a set $\mathsf{P}$ of $n$ points in $\mathbb{R}^d$, for $d \leq n$, one can compute a $2\sqrt{d}\, n^5$-approximate HST of $\mathsf{P}$ in $O(d n \log n)$ expected time.
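
For readers unfamiliar with the terminology, the commonly used definition of a $t$-approximate HST is sketched below. It is assumed, not confirmed by this page, that the paper's own definition matches it. An HST over $\mathsf{P}$ is a rooted tree whose leaves are the points of $\mathsf{P}$, where every node $v$ carries a label $\Delta_v \geq 0$, leaves have label $0$, and a child's label never exceeds its parent's. The tree distance between two points is the label of their lowest common ancestor, and the HST $t$-approximates the original metric $d$ if tree distances overestimate by at most a factor $t$:

$$
d_{\mathcal{H}}(p, q) = \Delta_{\mathrm{lca}(p, q)}, \qquad d(p, q) \;\leq\; d_{\mathcal{H}}(p, q) \;\leq\; t \cdot d(p, q) \quad \text{for all } p, q \in \mathsf{P}.
$$

With $t = 2\sqrt{d}\, n^5$ and $d \leq n$, the factor is at most $2 n^{5.5}$, which is polynomial in $n$, so $\log t = O(\log n)$. This is consistent with the TL;DR's description of the HST as "low-quality": the approximation is coarse, but it costs only a logarithmic number of scales.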

Figures (2)

  • Figure 4.1: Left: A point set and its distance in a certain resolution. Right: A slice with its representatives, and the connected components of these representatives within a certain radius.
  • Figure 5.1: Stage I of the ANN search algorithm.

Theorems & Definitions (45)

  • Definition 2.1
  • Definition 2.2
  • Definition 2.3
  • Definition 2.4
  • Definition 2.5
  • Definition 2.6
  • Definition 2.7
  • Definition 2.8
  • Definition 2.9
  • Definition 2.10
  • ...and 35 more