LIRA: A Learning-based Query-aware Partition Framework for Large-scale ANN Search

Ximu Zeng, Liwei Deng, Penghao Chen, Xu Chen, Han Su, Kai Zheng

TL;DR

This work tackles inefficiencies in partition-based ANN search caused by probing waste and long-tail $k$NN distributions. It introduces LIRA, a learning-based, query-aware meta-index that directly predicts the $k$NN partitions for each query and employs a learning-based redundancy strategy to duplicate data points into replica partitions, enabling adaptive per-query $nprobe$. Through extensive experiments on five real-world high-dimensional datasets, LIRA consistently improves recall and reduces probe counts compared with IV* and BLISS baselines, with gains amplified at high recall levels. The approach offers practical scalability for large-scale ANN systems by combining a trainable probing model with targeted data redundancy and effective two-level indexing, and code is available for replication at the provided GitHub repository.

Abstract

Approximate nearest neighbor (ANN) search is fundamental in information retrieval. Previous partition-based methods improve search efficiency by probing only a subset of partitions, yet they face two common issues. In the query phase, a common strategy is to probe partitions according to the rank of the distance from the query to each partition centroid, which inevitably probes irrelevant partitions because it ignores the data distribution. In the partition construction phase, all partition-based methods face the boundary problem, which splits a query's nearest neighbors across multiple partitions, resulting in a long-tailed kNN distribution and degrading the optimal nprobe (i.e., the number of partitions to probe). To address these issues, we propose LIRA, a LearnIng-based queRy-aware pArtition framework. Specifically, we propose a probing model that directly probes the partitions containing the kNN of a query, which reduces probing waste and allows query-aware probing with an individual nprobe for each query. Moreover, we incorporate the probing model into a learning-based redundancy strategy to mitigate the adverse impact of the long-tailed kNN distribution on search efficiency. Extensive experiments on real-world vector datasets demonstrate the superiority of LIRA in the trade-off among accuracy, latency, and query fan-out. The code is available at https://github.com/SimoneZeng/LIRA-ANN-search.
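The probing waste described in the abstract can be illustrated with a small toy example. The sketch below is not taken from the LIRA code base: the function names, the three centroids, the six data points, and the query are all hypothetical, chosen only so that one of the query's true neighbors lands in the partition whose centroid is farthest away. It compares the number of partitions that must be probed under centroid-distance ranking with the number of partitions that actually hold a true neighbor.

```python
import math

def assign_partitions(points, centroids):
    """Assign each point to the partition with the nearest centroid."""
    return [min(range(len(centroids)), key=lambda b: math.dist(p, centroids[b]))
            for p in points]

def true_knn(query, points, k):
    """Brute-force ids of the k points closest to the query."""
    return sorted(range(len(points)), key=lambda i: math.dist(query, points[i]))[:k]

def knn_count_distribution(knn_ids, assignment, n_partitions):
    """Count how many of the query's true kNN fall in each partition."""
    counts = [0] * n_partitions
    for pid in knn_ids:
        counts[assignment[pid]] += 1
    return counts

def nprobe_by_distance_rank(query, centroids, counts):
    """Smallest prefix of the centroid-distance ranking covering all kNN partitions."""
    needed = {b for b, c in enumerate(counts) if c > 0}
    if not needed:
        return 0
    ranking = sorted(range(len(centroids)),
                     key=lambda b: math.dist(query, centroids[b]))
    for n, b in enumerate(ranking, start=1):
        needed.discard(b)
        if not needed:
            return n
    return len(centroids)

# Hypothetical toy data (2-D for readability).
centroids = [(0.0, 0.0), (4.0, 0.0), (0.0, 4.0)]
points = [(1.0, 1.0), (1.5, 0.5), (1.5, 2.2), (4.0, 0.5), (3.5, -1.0), (0.0, 4.5)]
query = (1.5, 1.0)

assignment = assign_partitions(points, centroids)
counts = knn_count_distribution(true_knn(query, points, k=3), assignment, len(centroids))
nprobe_rank = nprobe_by_distance_rank(query, centroids, counts)
nprobe_min = sum(1 for c in counts if c > 0)  # partitions that hold any true kNN

print(counts, nprobe_rank, nprobe_min)
```

In this toy setting the ranking-based strategy has to visit one partition that contains none of the query's neighbors before reaching the last relevant one. A model that predicts the relevant partitions directly, as the abstract proposes, would aim to skip such partitions.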

Paper Structure

This paper contains 23 sections, 4 equations, 15 figures, 3 tables, 1 algorithm.

Figures (15)

  • Figure 1: Example for probing waste. The blue point is a query, and the red points are the centroids of partitions.
  • Figure 2: Extra probing with distance ranking (LEFT) and common phenomenon of long-tail $k$NN (RIGHT).
  • Figure 3: Partition initialization (LEFT), the probing model training (MIDDLE) and learning-based redundancy strategy (RIGHT).
  • Figure 4: Ratio of long-tailed and non-long-tailed data points under a given $nprobe^*$ (LEFT). Recall (MIDDLE) and hit rate (RIGHT) of replica partitions among the top-M partitions, ranked by model output or by centroid distance.
  • Figure 5: Picking and duplicating potential long-tailed data points individually with the probing model is more efficient than using the ground-truth $k$NN count distribution globally.
  • ...and 10 more figures

Theorems & Definitions (4)

  • Definition 1: $k$NN Count Distribution
  • Definition 2: Recall@k
  • Definition 3: Long-tail Data Point
  • Definition 4: Objective
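
Only the titles of these definitions are listed here, so the following is a minimal sketch of Recall@k under the conventional reading (the fraction of the true $k$ nearest neighbors that appear among the $k$ returned results), which may differ in detail from the paper's Definition 2. The function name `recall_at_k` and the example ids are hypothetical.

```python
def recall_at_k(retrieved, ground_truth, k):
    """Fraction of the true k nearest neighbors found among the top-k results.

    Assumes the conventional definition |R ∩ G| / k, where R is the set of
    the first k retrieved ids and G is the set of the first k ground-truth ids.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    hits = len(set(retrieved[:k]) & set(ground_truth[:k]))
    return hits / k

# Hypothetical ids: two of the four true neighbors (1 and 3) were retrieved.
example_recall = recall_at_k([3, 7, 1, 9], [1, 3, 5, 8], k=4)
print(example_recall)
```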