Distance Adaptive Beam Search for Provably Accurate Graph-Based Nearest Neighbor Search

Yousef Al-Jazzazi, Haya Diwan, Jinrui Gou, Cameron Musco, Christopher Musco, Torsten Suel

TL;DR

The paper tackles the gap between theory and practice in graph-based ANN by introducing Adaptive Beam Search, a distance-based termination rule that replaces the fixed beam width. By decoupling search order from the stopping criterion and parameterizing the termination rule with $\gamma$, the authors prove guarantees on navigable graphs and demonstrate substantial empirical gains over standard beam search across multiple datasets and graph constructions. Theoretical results show that on navigable graphs the method returns provably approximate nearest neighbors, with exactness attainable at $\gamma=2$, while experiments report consistent reductions in distance computations (roughly 10–50%) at a given recall. The work offers both a principled understanding of graph navigability in ANN and a practical, easily adoptable improvement for popular graph-based methods like HNSW, Vamana, and NSG.

Abstract

Nearest neighbor search is central in machine learning, information retrieval, and databases. For high-dimensional datasets, graph-based methods such as HNSW, DiskANN, and NSG have become popular thanks to their empirical accuracy and efficiency. These methods construct a directed graph over the dataset and perform beam search on the graph to find nodes close to a given query. While significant work has focused on practical refinements and theoretical understanding of graph-based methods, many questions remain. We propose a new distance-based termination condition for beam search to replace the commonly used condition based on beam width. We prove that, as long as the search graph is navigable, our resulting Adaptive Beam Search method is guaranteed to approximately solve the nearest-neighbor problem, establishing a connection between navigability and the performance of graph-based search. We also provide extensive experiments on our new termination condition for both navigable graphs and approximately navigable graphs used in practice, such as HNSW and Vamana graphs. We find that Adaptive Beam Search outperforms standard beam search over a range of recall values, data sets, graph constructions, and target number of nearest neighbors. It thus provides a simple and practical way to improve the performance of popular methods.
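The difference between the two termination conditions can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the dictionary-based graph format, and the toy data are hypothetical, and the exact form of the adaptive rule (stop once the closest unexpanded node is at distance at least $(1+\gamma)$ times the distance of the $k$-th closest discovered node) is an assumption inferred from the Figure 2 caption.

```python
import heapq
import math


def generalized_beam_search(graph, points, query, start, k, should_stop):
    """Expand the closest unexpanded node until should_stop says to halt.

    graph: dict mapping node id -> list of out-neighbor ids (hypothetical format).
    should_stop(d_x, found): d_x is the distance of the closest unexpanded node,
    found is the ascending list of distances of all discovered nodes.
    Returns (ids of the k closest discovered nodes, number of distance computations).
    """
    discovered = {start: math.dist(points[start], query)}
    frontier = [(discovered[start], start)]  # min-heap of unexpanded nodes
    while frontier:
        d_x, x = heapq.heappop(frontier)
        # Re-sorting on every step is wasteful but keeps the sketch short.
        if should_stop(d_x, sorted(discovered.values())):
            break
        for y in graph[x]:
            if y not in discovered:
                discovered[y] = math.dist(points[y], query)
                heapq.heappush(frontier, (discovered[y], y))
    nearest = sorted(discovered, key=discovered.get)[:k]
    return nearest, len(discovered)


def beam_width_rule(b):
    """Standard rule: stop once the frontier is worse than the b-th best discovered node."""
    return lambda d_x, found: len(found) >= b and d_x > found[b - 1]


def adaptive_rule(k, gamma):
    """Distance-based rule (assumed form): stop once d_x >= (1 + gamma) * d_k."""
    return lambda d_x, found: len(found) >= k and d_x >= (1 + gamma) * found[k - 1]


# Toy data: five points on a line joined as a path, which is navigable
# because each node has a neighbor strictly closer to any other target.
points = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0), (4.0, 0.0)]
graph = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
query = (3.2, 0.0)

adaptive = generalized_beam_search(graph, points, query, 0, 1, adaptive_rule(1, 2.0))
standard = generalized_beam_search(graph, points, query, 0, 1, beam_width_rule(1))
print(adaptive, standard)
```

Both rules share the same search order and differ only in the callable passed as `should_stop`, which mirrors the decoupling of search order from stopping criterion described in the TL;DR.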

Paper Structure

This paper contains 22 sections, 1 theorem, 7 equations, 10 figures, 3 tables, 4 algorithms.

Key Result

Theorem 1

Suppose $d$ is a metric on $\mathcal{X}$ and $G$ is navigable under $d$. Then for any query $q \in \mathcal{X}$, if Adaptive Beam Search -- i.e., alg:gen_beam_search with stopping criterion eq:our_rule -- is run with parameter $0 < \gamma \leq 2$, it is guaranteed to return a set of $k$ points $\mat

Figures (10)

  • Figure 1: Histograms for the number of distance computations performed by standard beam search and our Adaptive Beam Search method when answering 10,000 queries for various datasets and search graphs (see the experiments section for details). For a fair comparison, the $b$ parameter in beam search and $\gamma$ parameter in Adaptive Beam Search were tuned to achieve a fixed level of recall for the batch of queries. The histograms for Adaptive Beam Search are consistently flatter, confirming the intuition that it better adapts to query difficulty, leading to fewer distance computations on average.
  • Figure 2: Visualization of the proof of Theorem 1. We let $\tilde{d}$ denote $d(q,\tilde{x})$. Our goal is to show that there is no undiscovered $z$ in a ball of radius $\frac{\gamma}{2}\tilde{d}$ around $q$, which is shown with a dotted line. If there were, we would obtain a contradiction. In particular, if $G$ is navigable, we argue that there must be some unexpanded node $w$ on a path of decreasing distance from $\tilde{x}$ to $z$. Since $w$ is closer to $z$ than $\tilde{x}$, it must lie in a ball of radius $\left(1+\frac{\gamma}{2}\right)\tilde{d}$ around $z$, which is contained in a ball of radius $(1+\gamma)\tilde{d}$ around $q$. However, by \ref{eq:c_condition}, no unexpanded node can lie in that ball.
  • Figure 3: Navigable Graphs: Comparison of generalized beam search termination conditions on navigable graphs across three datasets: SIFT1M, DEEP96, and MNIST (columns), with $k = 1$ and $k = 10$ (rows). Adaptive Beam Search consistently outperforms standard beam search, while the alternative Adaptive Beam Search V2 underperforms both by a significant margin. Note that for $k = 1$, Adaptive Beam Search and Adaptive Beam Search V2 are identical, so only one line is shown.
  • Figure 4: Heuristic Graphs: Comparison of generalized beam search termination conditions on heuristic graphs produced by NSG, Vamana, EFANNA, and HNSW (rows), for $k = 10$ with three datasets: SIFT1M, DEEP256, and MNIST (columns). Adaptive Beam Search consistently outperforms standard beam search across all cases, sometimes by a significant margin.
  • Figure 5: Example showing that standard beam search fails to find a nearest neighbor in a navigable graph. Points $\mathbf{x}_4,\ldots, \mathbf{x}_n$ are all located arbitrarily close to $(1,0)$. They are all connected to $\mathbf{x}_1$ and $\mathbf{x}_2$, as well as to each other. The graph is navigable, since we can navigate from $\mathbf{x}_1, \mathbf{x}_4,\ldots, \mathbf{x}_n$ to $\mathbf{x}_3$ and vice versa through $\mathbf{x}_2$. All other nodes are directly connected to each other. Suppose beam search with beam width $b \le n-3$ is initialized at $\mathbf{x}_1$ with query $\mathbf{q}$. Because $\mathbf{x}_4,\ldots, \mathbf{x}_n$ are all closer to $\mathbf{q}$ than $\mathbf{x}_2$ is, the method will never expand $\mathbf{x}_2$ and thus fail to reach the nearest neighbor $\mathbf{x}_3$.
  • ...and 5 more figures
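
The containment step in the Figure 2 caption can be written out with two applications of the triangle inequality. This is a reconstruction from the caption alone, assuming that $d$ is a metric, that $d(q,z) \le \frac{\gamma}{2}\tilde{d}$, and that $w$ is at least as close to $z$ as $\tilde{x}$ is; it is not quoted from the paper's proof.

$$
d(z, w) \le d(z, \tilde{x}) \le d(z, q) + d(q, \tilde{x}) \le \tfrac{\gamma}{2}\tilde{d} + \tilde{d} = \left(1 + \tfrac{\gamma}{2}\right)\tilde{d}
$$

$$
d(q, w) \le d(q, z) + d(z, w) \le \tfrac{\gamma}{2}\tilde{d} + \left(1 + \tfrac{\gamma}{2}\right)\tilde{d} = (1 + \gamma)\,\tilde{d}
$$

The second bound places the unexpanded node $w$ inside the ball of radius $(1+\gamma)\tilde{d}$ around $q$, which is the situation the caption says the termination condition rules out.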

Theorems & Definitions (11)

  • Definition 1: Navigable Graph
  • Theorem 1
  • Claim 2
  • proof: Proof of Theorem 1
  • Claim 3
  • Claim 4
  • proof
  • Claim 5
  • proof
  • Claim 6
  • ...and 1 more