SOAR: Improved Indexing for Approximate Nearest Neighbor Search
Philip Sun, David Simcha, Dave Dopson, Ruiqi Guo, Sanjiv Kumar
TL;DR
SOAR improves ANN index quality by coupling spill-based vector quantization, in which each datapoint is assigned to more than one partition, with an orthogonality-aware residual loss. For a datapoint with primary residual $r$ and candidate spilled residual $r'$, the method defines a weighted spill loss $\mathcal{L}(r', r, \mathcal{Q}) = \mathbb{E}_{q\in\mathcal{Q}}[ w(\cos\theta) \langle q, r' \rangle^2]$ with $w(t)=|t|^{\lambda}$, where $\theta$ is the angle between the query $q$ and $r$. For queries distributed uniformly on the unit hypersphere, it proves that $\mathcal{L}(r', r, \mathcal{Q}) \propto \lambda\|r'_{\parallel}\|^2 + \|r'\|^2$, where $r'_{\parallel}$ is the component of $r'$ parallel to $r$, so the optimal spilled assignment favors residuals that are close to orthogonal to the primary residual. By training spilled assignments to reduce correlation with the primary residual, SOAR achieves better k-means recall (KMR) and substantial end-to-end throughput gains on large-scale datasets with only modest memory overhead. Empirical results show improved KMR curves, reduced angular correlation between primary and spilled residuals, and strong performance advantages over state-of-the-art ANN systems, particularly at massive dataset sizes and higher recall targets. The approach is relevant to scalable retrieval in applications such as language-model context augmentation, image search, and question answering, where fast, memory-efficient ANN search is essential.
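The proportionality above suggests a simple rule for choosing a spilled assignment: among the non-primary centroids, pick the one whose residual minimizes $\|r'\|^2 + \lambda\|r'_{\parallel}\|^2$. The following is a minimal pure-Python sketch of that rule, not the authors' implementation. It assumes residuals are defined as datapoint minus centroid, and the names `soar_loss` and `assign_spill`, as well as the toy data, are hypothetical.

```python
import math


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def sub(a, b):
    """Elementwise difference a - b."""
    return [x - y for x, y in zip(a, b)]


def soar_loss(r_spill, r_primary, lam):
    """Closed-form spill loss up to a constant: ||r'||^2 + lam * ||r'_par||^2.

    r'_par is the projection of r_spill onto r_primary (assumed convention).
    """
    norm_sq = dot(r_spill, r_spill)
    primary_sq = dot(r_primary, r_primary)
    if primary_sq == 0.0:
        # The datapoint coincides with its primary centroid, so there is no
        # direction to penalize; fall back to plain squared distance.
        return norm_sq
    par_sq = dot(r_spill, r_primary) ** 2 / primary_sq
    return norm_sq + lam * par_sq


def assign_spill(x, centroids, primary_idx, lam):
    """Return the index of the non-primary centroid with the lowest spill loss."""
    r_primary = sub(x, centroids[primary_idx])
    best_idx, best_loss = None, math.inf
    for i, c in enumerate(centroids):
        if i == primary_idx:
            continue
        loss = soar_loss(sub(x, c), r_primary, lam)
        if loss < best_loss:
            best_idx, best_loss = i, loss
    return best_idx


# Toy data (hypothetical): centroid 0 is the primary assignment of x.
x = [1.0, 0.0]
centroids = [[0.0, 0.0], [-0.2, 0.0], [1.0, 1.3]]

nearest_spill = assign_spill(x, centroids, primary_idx=0, lam=0.0)
orthogonal_spill = assign_spill(x, centroids, primary_idx=0, lam=1.0)
```

With $\lambda = 0$ the rule reduces to choosing the second-nearest centroid (index 1 here), whose residual is parallel to the primary residual. With $\lambda = 1$ the parallel penalty makes the slightly farther centroid (index 2), whose residual is orthogonal to the primary residual, the preferred spill target.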
Abstract
This paper introduces SOAR: Spilling with Orthogonality-Amplified Residuals, a novel data indexing technique for approximate nearest neighbor (ANN) search. SOAR extends previous approaches to ANN search, such as spill trees, which use multiple redundant representations while partitioning the data to reduce the probability of missing a nearest neighbor during search. Rather than training and computing these redundant representations independently, however, SOAR uses an orthogonality-amplified residual loss, which optimizes each representation to compensate for cases where other representations perform poorly. This drastically improves the overall index quality, resulting in state-of-the-art ANN benchmark performance while maintaining fast indexing times and low memory consumption.
