Fast High-dimensional Approximate Nearest Neighbor Search with Efficient Index Time and Space
Mingyu Yang, Wentao Li, Wei Wang
TL;DR
This work tackles AKNN search in high-dimensional spaces, where relying on vector quantization alone hurts recall. It introduces Minimized Residual Quantization (MRQ), a multi-stage distance correction framework that projects vectors to a lower-dimensional subspace via a PCA-based rotation and decomposes each distance into a quantized projection term and a residual term, enabling flexible quantization bit-lengths. MRQ combines a RaBitQ-like quantization of the projected part with error-bound-based corrections for the residual, providing guarantees on recall while leveraging hardware-efficient computations. Empirically, MRQ delivers speedups of up to about 3x with only 1/3 of the original quantized code length and negligible index-time and index-space overhead, outperforming graph-based and quantization-based AKNN baselines across diverse datasets.
Abstract
Approximate K nearest neighbor (AKNN) search in high-dimensional Euclidean space is a fundamental problem with widespread applications. Vector quantization, which maps vectors to discrete quantized codes, can significantly reduce the space cost of AKNN search while also accelerating the search. However, using vector quantization alone, without the precise vectors, leads to a substantial decline in search accuracy. The recent method RaBitQ addresses this issue by using geometric relations to enhance quantization accuracy and by employing an error bound for distance correction with the precise vectors. RaBitQ, however, requires the number of quantization bits to equal the vector dimension, resulting in a fixed compression ratio that limits its efficiency and flexibility. In this paper, we propose MRQ, a new and efficient method that addresses this drawback. MRQ leverages the data distribution to achieve better distance correction and a higher vector compression ratio, and it reduces query latency through a highly efficient distance computation and correction scheme. Our results demonstrate that MRQ significantly outperforms state-of-the-art graph-based and quantization-based AKNN search methods, achieving up to a 3x speed-up with only 1/3 of the quantized code length while maintaining the same accuracy.
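
The projection-plus-residual idea can be illustrated with a small sketch. It is not the authors' implementation: the function names (`pca_rotation`, `split_distance`), the synthetic data, and the choice of `d = 8` are all assumptions made for illustration, and no quantization is applied. It relies only on the fact that an orthogonal rotation preserves Euclidean distance, so the squared distance splits exactly into a term over the first `d` rotated coordinates and a term over the remaining ones.

```python
import numpy as np


def pca_rotation(data):
    """Return the data mean and an orthogonal matrix whose columns are
    principal directions, ordered by decreasing variance."""
    mean = data.mean(axis=0)
    centered = data - mean
    # Rows of vt are the principal directions, sorted by singular value.
    _, _, vt = np.linalg.svd(centered, full_matrices=True)
    return mean, vt.T


def split_distance(o, q, mean, rotation, d):
    """Split the squared distance between o and q into a projected term
    (first d rotated coordinates) and a residual term (the rest)."""
    o_rot = (o - mean) @ rotation
    q_rot = (q - mean) @ rotation
    proj_term = float(np.sum((o_rot[:d] - q_rot[:d]) ** 2))
    resid_term = float(np.sum((o_rot[d:] - q_rot[d:]) ** 2))
    return proj_term, resid_term


# Hypothetical synthetic data: 32 dimensions with rapidly decaying variance,
# standing in for a dataset whose energy concentrates in a few directions.
rng = np.random.default_rng(0)
scales = 1.0 / np.arange(1, 33)
data = rng.normal(size=(500, 32)) * scales

mean, rotation = pca_rotation(data)
o, q = data[0], data[1]
proj, resid = split_distance(o, q, mean, rotation, d=8)
exact = float(np.sum((o - q) ** 2))

# Share of total variance captured by the first 8 rotated coordinates.
rotated = (data - mean) @ rotation
variance = rotated.var(axis=0)
kept_fraction = float(variance[:8].sum() / variance.sum())

print(f"projected + residual = {proj + resid:.6f}, exact = {exact:.6f}")
print(f"variance kept by first 8 dimensions: {kept_fraction:.3f}")
```

In this toy setting only the first `d` coordinates would need a quantized code, which is what would allow a code shorter than the original dimension; how the residual term is bounded and corrected is specific to MRQ and is not reproduced here.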
