SOAR: Improved Indexing for Approximate Nearest Neighbor Search

Philip Sun; David Simcha; Dave Dopson; Ruiqi Guo; Sanjiv Kumar

TL;DR

SOAR improves ANN index quality by coupling spill-based vector quantization (VQ) with an orthogonality-aware residual loss. The method derives a weighted spill loss, $\mathcal{L}(r', r, \mathcal{Q}) = \mathbb{E}_{q\in\mathcal{Q}}[ w(\cos\theta) \langle q, r' \rangle^2]$ with $w(t)=|t|^{\lambda}$, where $r$ is the residual of the primary assignment, $r'$ is the residual of the spilled assignment, and $\theta$ is the angle between the query $q$ and $r$. For queries uniformly distributed over the unit hypersphere, it proves that $\mathcal{L}(r', r, \mathcal{Q}) \propto \lambda\|r'_{\parallel}\|^2 + \|r'\|^2$, where $r'_{\parallel}$ is the component of $r'$ parallel to $r$; the loss therefore favors spilled residuals that are orthogonal to the primary residual. By training spilled assignments to reduce correlation with the primary residual, SOAR achieves better k-means recall (KMR) curves and substantial end-to-end throughput gains on large-scale datasets with only modest memory overhead. Empirical results show improved KMR curves, reduced correlation between the query-residual angles of the two assignments, and strong performance advantages over state-of-the-art ANN systems, particularly at massive dataset sizes and higher recall targets. The approach has practical implications for scalable retrieval in applications such as language-model context augmentation, image search, and QA, where fast, memory-efficient ANN is essential.

Abstract

This paper introduces SOAR: Spilling with Orthogonality-Amplified Residuals, a novel data indexing technique for approximate nearest neighbor (ANN) search. SOAR extends previous approaches to ANN search, such as spill trees, that utilize multiple redundant representations while partitioning the data to reduce the probability of missing a nearest neighbor during search. Rather than training and computing these redundant representations independently, however, SOAR uses an orthogonality-amplified residual loss, which optimizes each representation to compensate for cases where other representations perform poorly. This drastically improves the overall index quality, resulting in state-of-the-art ANN benchmark performance while maintaining fast indexing times and low memory consumption.
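
The spilled assignment described above can be illustrated with a small sketch. It assumes, as a simplification, that the primary assignment is the nearest centroid in Euclidean distance and that the spilled centroid is the one minimizing the closed-form loss from the TL;DR, $\|r'\|^2 + \lambda\|r'_{\parallel}\|^2$. The function name `soar_assign`, the toy centroids, and the values of `lam` are hypothetical and chosen only for illustration; this is not the authors' implementation.

```python
def soar_assign(x, centroids, lam):
    """Return (primary, spill) centroid indices for one datapoint x.

    Assumption: primary = nearest centroid; spill = centroid (other than the
    primary) minimizing ||r'||^2 + lam * ||proj_r r'||^2, where r is the
    primary residual and r' is the candidate's residual.
    """
    def sub(a, b):
        return [ai - bi for ai, bi in zip(a, b)]

    def dot(a, b):
        return sum(ai * bi for ai, bi in zip(a, b))

    residuals = [sub(x, c) for c in centroids]
    sq_norms = [dot(rp, rp) for rp in residuals]

    # Primary assignment: smallest squared residual norm.
    primary = min(range(len(centroids)), key=sq_norms.__getitem__)
    r = residuals[primary]
    r_sq = sq_norms[primary]

    def loss(j):
        rp = residuals[j]
        # Squared norm of the projection of r' onto r: <r', r>^2 / ||r||^2.
        parallel_sq = dot(rp, r) ** 2 / r_sq if r_sq > 0 else 0.0
        return sq_norms[j] + lam * parallel_sq

    candidates = [j for j in range(len(centroids)) if j != primary]
    spill = min(candidates, key=loss)
    return primary, spill


# Toy example: centroid 1 is the second closest, but its residual is
# parallel to the primary residual; centroid 2 is farther but orthogonal.
toy_centroids = [[1.0, 0.0], [1.2, 0.0], [0.0, 1.5]]
toy_x = [0.0, 0.0]
naive = soar_assign(toy_x, toy_centroids, lam=0.0)  # lam = 0: plain top-2
soar = soar_assign(toy_x, toy_centroids, lam=1.0)
print(naive, soar)
```

With `lam=0.0` the loss reduces to plain distance and the sketch picks the two closest centroids; with a positive `lam` the parallel component is penalized, so the orthogonal centroid can win even though it is farther away.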

Paper Structure (28 sections, 4 theorems, 15 equations, 12 figures, 2 tables)

Introduction
Preliminaries and Notation
Maximum inner product search (MIPS)
Vector quantization (VQ)
The k-means recall (KMR) curve
Method
Search difficulty and quantized score error
Quantized score error decomposition
Spilled VQ assignment
Spilling with orthogonality-amplified residuals
Implementation considerations
Spilling to further centroids
Related Works
Spill trees
Graph-based algorithms
...and 13 more sections

Key Result

Theorem 3.1

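For the weight function $w(t)=|t|^\lambda$ and a query distribution $\mathcal{Q}$ that is uniformly distributed over the unit hypersphere, $\mathcal{L}(r', r, \mathcal{Q}) \propto \lambda\|r'_{\parallel}\|^2 + \|r'\|^2$, where $r'_{\parallel}$ is the component of $r'$ parallel to $r$.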
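
The closed form quoted in the TL;DR for this setting, $\mathcal{L}(r', r, \mathcal{Q}) \propto \lambda\|r'_{\parallel}\|^2 + \|r'\|^2$, can be sanity-checked numerically. The sketch below estimates the weighted spill loss $\mathbb{E}_{q}[|\cos\theta|^{\lambda}\langle q, r'\rangle^2]$ by Monte Carlo over queries drawn uniformly from the unit hypersphere, then compares the ratio of two estimates against the ratio of the closed-form values (a ratio is used because the theorem only states proportionality). The function names, the dimension 8, the sample count, and $\lambda = 2$ are assumptions made for illustration only.

```python
import math
import random


def mc_spill_loss(r_prime, r, lam, n_samples=100_000, seed=0):
    """Monte Carlo estimate of E_q[|cos(theta)|^lam * <q, r'>^2], with q
    uniform on the unit hypersphere and theta the angle between q and r."""
    rng = random.Random(seed)
    d = len(r)
    r_norm = math.sqrt(sum(v * v for v in r))
    total = 0.0
    for _ in range(n_samples):
        # A normalized Gaussian vector is uniform on the unit hypersphere.
        g = [rng.gauss(0.0, 1.0) for _ in range(d)]
        g_norm = math.sqrt(sum(v * v for v in g))
        q = [v / g_norm for v in g]
        cos_theta = sum(qi * ri for qi, ri in zip(q, r)) / r_norm
        score_err = sum(qi * ri for qi, ri in zip(q, r_prime))
        total += abs(cos_theta) ** lam * score_err ** 2
    return total / n_samples


def closed_form(r_prime, r, lam):
    """lam * ||r'_parallel||^2 + ||r'||^2, up to the constant of proportionality."""
    r_sq = sum(v * v for v in r)
    parallel_sq = sum(a * b for a, b in zip(r_prime, r)) ** 2 / r_sq
    return lam * parallel_sq + sum(v * v for v in r_prime)


dim, lam = 8, 2.0
r = [2.0] + [0.0] * (dim - 1)            # primary residual along axis 0
r_par = [1.0] + [0.0] * (dim - 1)        # candidate r' parallel to r
r_orth = [0.0, 1.0] + [0.0] * (dim - 2)  # candidate r' orthogonal to r

# Closed-form ratio is (lam * 1 + 1) / (0 + 1) = 3 for lam = 2.
ratio_cf = closed_form(r_par, r, lam) / closed_form(r_orth, r, lam)
ratio_mc = mc_spill_loss(r_par, r, lam) / mc_spill_loss(r_orth, r, lam)
print(ratio_cf, ratio_mc)
```

Both Monte Carlo calls use the same seed, so they share query samples and the ratio is less noisy than two independent estimates would be.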

Figures (12)

Figure 1: Greater search difficulty, as quantified by a higher $\mathrm{Rank}(q,\mathcal{C}_{\pi(x)},\mathcal{C})$, is associated with highly positive $\left\langle q,r \right\rangle$.
Figure 2: The cosine of the query-residual angle, $\cos\theta$ (left), is far more correlated with $\left\langle q,r \right\rangle$ than the residual norm $\|r\|$ (right), making the former a more promising target for reducing $\left\langle q,r \right\rangle$.
Figure 3: Naive spilled VQ assignment may be ineffective; selecting the two closest centroids $\mathcal{C}_1$ and $\mathcal{C}_2$ provides no benefit over using just $\mathcal{C}_1$.
Figure 4: On Glove-1M, both a naive top-2 spilled assignment and two separately trained VQ indices exhibit noticeable correlation in query-residual angles; this reduces spilled assignment efficacy.
Figure 5: Memory layout changes for a SOAR-enabled ANN index. Memory footprint is proportional to area of colored cell. VQ centroid data (not shown) remains unchanged. We can see that the additional memory occupied by SOAR (dark blue) is low relative to the total memory consumption.
...and 7 more figures

Theorems & Definitions (6)

Theorem 3.1
proof
Corollary 3.1.1
Corollary 3.1.2
Lemma 3.2
proof
