SAQ: Pushing the Limits of Vector Quantization through Code Adjustment and Dimension Segmentation

Hui Li, Shiyuan Deng, Xiao Yan, Xiangyu Zhi, James Cheng

TL;DR

SAQ introduces two core ideas—code adjustment (CAQ) and dimension segmentation (SAQ)—to advance vector quantization for ANNS. CAQ replaces expensive per-vector enumeration with a fast, coordinate-descent-style refinement that aligns quantized dimensions with the original vector, achieving $O(D)$ encoding. SAQ further improves accuracy by segmenting PCA-projected dimensions and optimally allocating bits across segments via dynamic programming under a total quota $Q_{quota}$, supported by a multi-stage distance estimator for pruning. Together, SAQ and CAQ deliver substantial gains over state-of-the-art methods, with reported improvements in quantization error, encoding speed, and query throughput across multiple real-world datasets. These contributions offer a practical, scalable path to high-accuracy, efficient vector quantization for large-scale ANNS systems.
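The code-adjustment idea can be illustrated with a small sketch. It is not the authors' implementation: the uniform grid with midpoint reconstruction, the use of cosine similarity as the objective, and the names `quantize_with_adjustment`, `bits`, and `rounds` are all assumptions made for illustration. Each dimension is first quantized independently; each code is then moved by one level whenever that increases the similarity to the original vector. For a fixed number of rounds the cost is linear in the dimension $D$.

```python
import numpy as np


def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


def quantize_with_adjustment(x, bits=4, rounds=4):
    """Per-dimension uniform quantization, then coordinate-wise code refinement.

    Assumptions (not taken from the paper): a symmetric uniform grid over
    [-vmax, vmax] with midpoint reconstruction, and cosine similarity as the
    quantity being improved.
    """
    x = np.asarray(x, dtype=float)
    levels = 2 ** bits
    vmax = float(np.max(np.abs(x))) + 1e-12
    step = 2.0 * vmax / levels

    def decode(c):
        # Map an integer code to the midpoint of its grid cell (never exactly 0).
        return (c + 0.5) * step - vmax

    # Step 1: quantize every dimension independently.
    codes = np.clip(np.floor((x + vmax) / step), 0, levels - 1).astype(np.int64)
    q = decode(codes)
    q_init = q.copy()

    # Step 2: coordinate-descent-style refinement. Keep <x, q> and <q, q>
    # up to date so that each trial move costs O(1).
    dot = float(x @ q)
    qq = float(q @ q)
    for _ in range(rounds):
        changed = False
        for i in range(x.size):
            best_score = dot / np.sqrt(qq)  # proportional to the cosine
            best = None
            for delta in (-1, 1):
                c = codes[i] + delta
                if c < 0 or c >= levels:
                    continue
                qi = decode(c)
                new_dot = dot + x[i] * (qi - q[i])
                new_qq = qq + qi * qi - q[i] * q[i]
                score = new_dot / np.sqrt(new_qq)
                if score > best_score + 1e-12:
                    best_score, best = score, (c, qi, new_dot, new_qq)
            if best is not None:
                codes[i], q[i], dot, qq = best
                changed = True
        if not changed:
            break
    return codes, q_init, q
```

Because a move is accepted only when it strictly increases the score, the refined vector is never less aligned with the input than the initial per-dimension quantization.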

Abstract

Approximate Nearest Neighbor Search (ANNS) plays a critical role in applications such as search engines, recommender systems, and RAG for LLMs. Vector quantization (VQ), a crucial technique for ANNS, is commonly used to reduce space overhead and accelerate distance computations. However, despite significant research advances, state-of-the-art VQ methods still face challenges in balancing encoding efficiency and quantization accuracy. To address these limitations, we propose a novel VQ method called SAQ. To improve accuracy, SAQ employs a new dimension segmentation technique to strategically partition PCA-projected vectors into segments along their dimensions. By prioritizing leading dimension segments with larger magnitudes, SAQ allocates more bits to high-impact segments, optimizing the use of the available space quota. An efficient dynamic programming algorithm is developed to optimize dimension segmentation and bit allocation, ensuring minimal quantization error. To speed up vector encoding, SAQ devises a code adjustment technique to first quantize each dimension independently and then progressively refine quantized vectors using a coordinate-descent-like approach to avoid exhaustive enumeration. Extensive experiments demonstrate SAQ's superiority over classical methods (e.g., PQ, PCA) and recent state-of-the-art approaches (e.g., LVQ, Extended RaBitQ). SAQ achieves up to 80% reduction in quantization error and accelerates encoding speed by over 80x compared to Extended RaBitQ.
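The bit-allocation step can be sketched as a small knapsack-style dynamic program. This is a simplified illustration, not the paper's algorithm: it assumes fixed-width blocks of PCA-ordered dimensions, and it assumes that a block with variance sum $v$ quantized at $b$ bits per dimension contributes error $v \cdot 4^{-b}$ (a common high-resolution approximation, not the paper's error model). The names `allocate_bits`, `merge_segments`, `block`, `quota`, and `max_bits` are hypothetical. Adjacent blocks that receive the same bit width are merged into segments afterwards.

```python
import numpy as np


def allocate_bits(variances, block=4, quota=32, max_bits=8):
    """Choose bits per dimension for each block under a total bit quota.

    Assumed error model (illustrative only): a block with variance sum v at
    b bits per dimension contributes v * 4**(-b); b = 0 drops the block.
    """
    v = np.asarray(variances, dtype=float)
    if v.size % block != 0:
        raise ValueError("number of dimensions must be a multiple of block")
    n_blocks = v.size // block
    block_var = v.reshape(n_blocks, block).sum(axis=1)

    # dp[j][q]: minimum modeled error for the first j blocks using exactly q bits.
    dp = np.full((n_blocks + 1, quota + 1), np.inf)
    choice = np.zeros((n_blocks + 1, quota + 1), dtype=int)
    dp[0][0] = 0.0
    for j in range(1, n_blocks + 1):
        for q in range(quota + 1):
            for b in range(max_bits + 1):
                cost = b * block
                if cost > q:
                    break
                cand = dp[j - 1][q - cost] + block_var[j - 1] * 4.0 ** (-b)
                if cand < dp[j][q]:
                    dp[j][q] = cand
                    choice[j][q] = b

    # Pick the best total that fits the quota, then backtrack the choices.
    q = int(np.argmin(dp[n_blocks]))
    total_err = float(dp[n_blocks][q])
    bits = []
    for j in range(n_blocks, 0, -1):
        b = int(choice[j][q])
        bits.append(b)
        q -= b * block
    bits.reverse()
    return bits, total_err


def merge_segments(bits, block):
    """Merge adjacent equal-width blocks into (start_dim, end_dim, bits) segments."""
    segments = []
    start = 0
    for j in range(1, len(bits) + 1):
        if j == len(bits) or bits[j] != bits[start]:
            segments.append((start * block, j * block, bits[start]))
            start = j
    return segments
```

With variances that decay along the PCA order, this model gives the leading blocks more bits than the trailing ones, which matches the intuition stated in the abstract. Under the same quota, the modeled error is never worse than that of a uniform allocation.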

Paper Structure

This paper contains 18 sections, 2 theorems, 21 equations, 12 figures, 6 tables, 2 algorithms.

Key Result

Lemma 1

The inner-product estimator is unbiased and, with probability at least $1-\exp(-c_0\epsilon_0^2)$, its error is bounded, where $c_0$ is a constant and $\epsilon_0$ is a parameter that controls the probability that the bound fails.

Figures (12)

  • Figure 1: Illustration of dimension balancing and dimension reduction. Bar height indicates the magnitude of each vector dimension.
  • Figure 2: The vector approximation error of SAQ and representative baselines on the GIST dataset. Note that RaBitQ refers to extended RaBitQ (same for other experiments).
  • Figure 3: The codebook structure of extended RaBitQ with dimension $D=2$ and $B=2$ bits for each dimension. Red points are the final codewords. Figure reproduced from the extended RaBitQ paper.
  • Figure 4: The CAQ quantization procedure, which starts with LVQ and then adjusts the quantized vector to align with the data vector in direction.
  • Figure 5: Variance of each vector dimension after PCA projection.
  • ...and 7 more figures

Theorems & Definitions (3)

  • Lemma 1: Estimator and Error Bound
  • Remark 1: Error Bound
  • Lemma 2