B+ANN: A Fast Billion-Scale Disk-based Nearest-Neighbor Index

Selim Furkan Tekin, Rajesh Bordawekar

TL;DR

B+ANN introduces a disk-based, memory-efficient ANN index that partitions data into semantically coherent blocks and uses a B+Tree-inspired structure to store blocks on disk while keeping inner navigation in memory. By adding skip-edge connections and leveraging batch-friendly leaf-level computations, it substantially improves cache locality, reduces hops, and enables dissimilarity queries unavailable in typical HNSW-based systems. The approach demonstrates up to 10x–50x speedups over in-memory baselines and dramatic reductions in build time and memory usage for disk-resident indexing, with strong performance on both standard benchmarks and dissimilarity search tasks. It also exploits temporal correlations via semantic views to sustain fast, context-aware searches in interactive, multi-turn AI systems, underscoring practical value for large-scale VDBs and RAG pipelines.

Abstract

The storage and processing of embedding vectors by specialized vector databases (VDBs) has become the linchpin in building modern AI pipelines. Most current VDBs employ variants of a graph-based approximate nearest-neighbor (ANN) index algorithm, HNSW, to answer semantic queries over stored vectors. In spite of its widespread use, the HNSW algorithm suffers from several issues: an in-memory design and implementation, random memory accesses that degrade cache behavior, limited acceleration scope due to fine-grained pairwise computations, and support for semantic similarity queries only. In this paper, we present a novel disk-based ANN index, B+ANN, to address these issues: it first partitions input data into blocks containing semantically similar items, then builds a B+ tree variant to store blocks both in memory and on disk, and finally enables hybrid edge- and block-based in-memory traversals. As demonstrated by our experimental evaluation, the proposed B+ANN disk-based index improves both quality (Recall value) and execution performance (queries per second, QPS) over HNSW by improving spatial and temporal locality for semantic operations and reducing cache misses (19.23% relative gain), and it decreases memory consumption and disk-based build time by 24x relative to the DiskANN algorithm. Finally, it enables dissimilarity queries, which are not supported by similarity-oriented ANN indices.

Paper Structure

This paper contains 21 sections, 3 equations, 11 figures, 2 tables, 3 algorithms.

Figures (11)

  • Figure 1: (a) The HNSW retrieval pattern, from the highest to the lowest level. Each visit to a node involves a memory access and a pairwise distance calculation. (b) The B+ANN retrieval pattern. The first phase is the tree traversal, and the second phase follows the skip-edge connections. Each leaf node incurs one memory access, followed by a vector–matrix operation that computes the distances between the query and all vectors stored in the leaf. Note that the number of B+ANN steps (23) is substantially lower than for HNSW (140).
  • Figure 2: (a) Our observation from multi-turn conversations with a RAG system [katsis2025mtrag]: the probability of retrieving the same document at each turn of a conversation with an LLM. (b) Our proposed view-creation system, which exploits the temporal relation between successive queries.
  • Figure 3: The three phases of B+ANN indexing. The first phase partitions the vector space with hierarchical clustering. The second phase builds the B+ANN tree with skip-edge connections. The third phase indexes the query and performs accesses.
  • Figure 4: In-memory performance of ANN algorithms and B+ANN. We show QPS vs Recall-10 and Recall-100 curves of benchmark algorithms and B+ANN for Arm64 and x86 architectures.
  • Figure 5: B+ANN Performance for the SIFT-1B dataset
  • ...and 6 more figures
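
The Figure 1 caption contrasts per-node pairwise distance calculations with a single block-level computation per leaf. The following is a minimal sketch of that leaf-level step only, not the authors' implementation. It assumes squared L2 distance and plain Python lists standing in for a leaf's vector matrix; the names `leaf_distances`, `top_k_in_leaf`, and the toy leaf contents are hypothetical.

```python
import heapq
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def leaf_distances(query: Vector, leaf_block: Sequence[Vector]) -> List[float]:
    """Squared L2 distance from the query to every vector in one leaf block.

    The block is fetched once and scored in a single pass, which is the
    list-based analogue of one vector-matrix operation per leaf.
    (Squared L2 is an assumption; the source does not fix the metric here.)
    """
    return [sum((q - x) ** 2 for q, x in zip(query, row)) for row in leaf_block]


def top_k_in_leaf(
    query: Vector,
    leaf_block: Sequence[Vector],
    ids: Sequence[int],
    k: int,
) -> List[Tuple[float, int]]:
    """Return the k closest (distance, id) pairs within a single leaf."""
    dists = leaf_distances(query, leaf_block)
    return heapq.nsmallest(k, zip(dists, ids))


# Hypothetical leaf holding four 2-d vectors and their item ids.
leaf = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0], [3.0, 3.0]]
leaf_ids = [10, 11, 12, 13]

nearest = top_k_in_leaf([0.9, 0.1], leaf, leaf_ids, k=2)
nearest_ids = [item_id for _, item_id in nearest]
```

In a real system the inner loop would typically be a single BLAS or SIMD call over a contiguous matrix. The point of the sketch is only that one memory access to the leaf yields distances to all of its vectors at once.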