B+ANN: A Fast Billion-Scale Disk-based Nearest-Neighbor Index
Selim Furkan Tekin, Rajesh Bordawekar
TL;DR
B+ANN introduces a disk-based, memory-efficient ANN index that partitions data into semantically coherent blocks and uses a B+Tree-inspired structure to store blocks on disk while keeping inner navigation in memory. By adding skip-edge connections and leveraging batch-friendly leaf-level computations, it achieves substantial improvements in cache locality, reduce hops, and enables dissimilarity queries unavailable in typical HNSW-based systems. The approach demonstrates up to 10x–50x speedups over in-memory baselines and dramatic reductions in build time and memory usage for disk-resident indexing, with strong performance on both standard benchmarks and dissimilarity search tasks. It also exploits temporal correlations via semantic views to sustain fast, context-aware searches in interactive, multi-turn AI systems, underscoring practical value for large-scale VDBs and RAG pipelines.
Abstract
Storing and processing of embedding vectors by specialized Vector databases (VDBs) has become the linchpin in building modern AI pipelines. Most current VDBs employ variants of a graph-based ap- proximate nearest-neighbor (ANN) index algorithm, HNSW, to an- swer semantic queries over stored vectors. Inspite of its wide-spread use, the HNSW algorithm suffers from several issues: in-memory design and implementation, random memory accesses leading to degradation in cache behavior, limited acceleration scope due to fine-grained pairwise computations, and support of only semantic similarity queries. In this paper, we present a novel disk-based ANN index, B+ANN, to address these issues: it first partitions input data into blocks containing semantically similar items, then builds an B+ tree variant to store blocks both in-memory and on disks, and finally, enables hybrid edge- and block-based in-memory traversals. As demonstrated by our experimantal evaluation, the proposed B+ANN disk-based index improves both quality (Recall value), and execution performance (Queries per second/QPS) over HNSW, by improving spatial and temporal locality for semantic operations, reducing cache misses (19.23% relative gain), and decreasing the memory consumption and disk-based build time by 24x over the DiskANN algorithm. Finally, it enables dissimilarity queries, which are not supported by similarity-oriented ANN indices.
