Approximate Cluster-Based Sparse Document Retrieval with Segmented Maximum Term Weights

Yifan Qiao, Shanxiu He, Yingrui Yang, Parker Carlson, Tao Yang

TL;DR

This work addresses efficient sparse document retrieval by integrating cluster-based pruning with segmented maximum term weights to tighten rank-bound estimates. The proposed ASC framework introduces a two-parameter control, (μ, η), and uses random segment partitions to achieve a probabilistic rank-safeness guarantee while adapting pruning aggressiveness to bound estimation tightness. The approach demonstrates improved speed with competitive or better relevance on MS MARCO and BEIR across SPLADE, uniCOIL, and LexMAE models, and is shown to be compatible with other efficiency techniques like time budgets and static pruning. Overall, ASC provides a practical, safeness-aware mechanism to prune clusters and documents, yielding meaningful latency reductions with minimal degradation in top-k quality.

Abstract

This paper revisits cluster-based retrieval, which partitions the inverted index into multiple groups and partially skips the index at the cluster and document levels during online inference with a learned sparse representation. It proposes an approximate search scheme with two parameters that control the rank-safeness competitiveness of pruning, using segmented maximum term weights within each cluster. Segmenting the cluster-level maximum weights improves rank score bound estimation and lets threshold-based pruning adapt approximately to bound estimation tightness, resulting in better relevance and efficiency. Experiments on MS MARCO passage ranking and BEIR datasets demonstrate the usefulness of the proposed scheme in comparison with the baselines. This paper presents the design of this approximate retrieval scheme with rank-safeness analysis, compares clustering and segmentation options, and reports evaluation results.
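
To make the idea of segmented maximum term weights concrete, the following is a minimal Python sketch and not the authors' implementation. It assumes that each cluster is split into segments, that each segment stores the maximum weight of every term over its documents, and that a cluster is skipped when its maximum segment bound is at most θ/μ and its average segment bound is at most θ/η, where θ is the current top-k threshold. That pruning rule, the data layout, and all function names are assumptions made for illustration; only the quantities MaxSBound and AvgSBound and the parameters (μ, η) are named in the text and figure captions here.

```python
from typing import Dict, List, Tuple

# Hypothetical layout: one dict per segment, mapping a term to the maximum
# weight of that term over the documents in the segment.
SegmentMaxWeights = Dict[str, float]


def segment_bounds(query: Dict[str, float],
                   segments: List[SegmentMaxWeights]) -> List[float]:
    """Upper bound on the query score of any document in each segment."""
    return [
        sum(q_w * seg.get(term, 0.0) for term, q_w in query.items())
        for seg in segments
    ]


def cluster_bounds(query: Dict[str, float],
                   segments: List[SegmentMaxWeights]) -> Tuple[float, float]:
    """Return (MaxSBound, AvgSBound) over the segments of one cluster."""
    bounds = segment_bounds(query, segments)
    return max(bounds), sum(bounds) / len(bounds)


def unsegmented_bound(query: Dict[str, float],
                      segments: List[SegmentMaxWeights]) -> float:
    """Bound from cluster-wide maximum term weights, ignoring segmentation."""
    return sum(
        q_w * max(seg.get(term, 0.0) for seg in segments)
        for term, q_w in query.items()
    )


def should_prune(max_bound: float, avg_bound: float,
                 theta: float, mu: float, eta: float) -> bool:
    """Assumed rule: skip the cluster when both scaled tests pass."""
    return max_bound <= theta / mu and avg_bound <= theta / eta


# Toy cluster with two segments and a two-term query.
query = {"a": 1.0, "b": 2.0}
segments = [{"a": 0.5, "b": 1.0}, {"a": 2.0}]

max_b, avg_b = cluster_bounds(query, segments)
loose = unsegmented_bound(query, segments)
```

In this toy cluster the segmented maximum bound is 2.5 while the unsegmented bound is 4.0, because the largest weights of terms "a" and "b" never occur in the same segment. Under the assumed rule, setting μ = η = 1 prunes only when the bounds do not exceed θ, and lowering μ below 1 prunes more aggressively.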

Paper Structure (14 sections, 9 equations, 6 figures, 7 tables)

Figures (6)

  • Figure 1: Flow of ASC: approximate retrieval with segmented cluster-level maximum term weights
  • Figure 2: The average ratio of the actual and estimated cluster bounds with Formula (\ref{eq:clusterbound}) on MS MARCO passages
  • Figure 3: Recall vs. latency of Anytime$^*$ on SPLADE with no time budget when $\mu$ takes the values 0.3, 0.5, 0.7, 0.9, and 1 for each fixed number of clusters. Retrieval depth $k=1000$.
  • Figure 4: A pruning example of Anytime, Anytime$^*$, and ASC
  • Figure 5: The correlation between bound estimation tightness and average $AvgSBound(C_i)/MaxSBound(C_i)$ for MS MARCO passage clusters
  • ...and 1 more figure