Neural Passage Quality Estimation for Static Pruning

Xuejun Chang; Debabrata Mishra; Craig Macdonald; Sean MacAvaney

TL;DR

This work tackles the problem of reducing neural search costs by pruning passages that are unlikely to satisfy any user query. It formalizes a query-agnostic passage quality signal and compares multiple estimators, finding that a supervised QT5-based approach provides the strongest, most consistent pruning signal. Across several pipelines (lexical, dense, learned sparse, and re-ranking), pruning 25–30% of passages yields statistically equivalent retrieval effectiveness, while also reducing indexing and retrieval costs; smaller QT5 variants further improve efficiency with minimal loss in performance. The study demonstrates transferability to larger corpora and different domains (MSMARCO v2, CORD-19) and discusses practical implications for energy efficiency and cost in AI-powered search, setting the stage for learning-what-to-index strategies and more integrated document/segment pruning.

Abstract

Neural networks -- especially those that use large, pre-trained language models -- have improved search engines in various ways. Most prominently, they can estimate the relevance of a passage or document to a user's query. In this work, we depart from this direction by exploring whether neural networks can effectively predict which of a document's passages are unlikely to be relevant to any query submitted to the search engine. We refer to this query-agnostic estimation of passage relevance as a passage's quality. We find that our novel methods for estimating passage quality allow passage corpora to be pruned considerably while maintaining statistically equivalent effectiveness; our best methods can consistently prune >25% of the passages in a corpus, across various retrieval pipelines. Such substantial pruning reduces the operating costs of neural search engines in terms of computing resources, power usage, and carbon footprint -- both when processing queries (thanks to a smaller index size) and when indexing (lightweight models can prune low-quality passages prior to the costly dense or learned sparse encoding step). This work sets the stage for developing more advanced neural "learning-what-to-index" methods.
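As a rough illustration of the static pruning idea described in the abstract, the sketch below drops the lowest-scoring fraction of passages before indexing. It assumes that some quality estimator has already assigned one query-agnostic score per passage; the function name `prune_corpus`, the passage ids, and the scores are all hypothetical and are not taken from the paper.

```python
from typing import Dict, List


def prune_corpus(quality: Dict[str, float], prune_fraction: float) -> List[str]:
    """Return the ids of passages kept after dropping the lowest-quality fraction.

    `quality` maps passage id -> estimated quality (higher means more likely
    to be relevant to some query). The estimator itself is assumed to exist
    elsewhere; this function only applies the pruning cutoff.
    """
    if not 0.0 <= prune_fraction < 1.0:
        raise ValueError("prune_fraction must be in [0, 1)")
    # Order passage ids from lowest to highest estimated quality.
    ranked = sorted(quality, key=quality.get)
    # Number of passages removed from the low-quality end.
    n_pruned = int(len(ranked) * prune_fraction)
    return ranked[n_pruned:]


# Hypothetical scores for four passages; pruning 25% removes one passage.
scores = {"p1": 0.91, "p2": 0.05, "p3": 0.62, "p4": 0.33}
kept = prune_corpus(scores, 0.25)
```

Only the passages in `kept` would then be passed to the costly dense or learned sparse encoder, which is where the indexing-time savings mentioned above would come from.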

Paper Structure (22 sections, 2 equations, 5 figures, 3 tables)

Introduction
Related Work
Passage Quality Estimation
Preliminaries
Statistical Quality Estimators
Unsupervised Neural Quality Estimators
Latent Neural Quality Estimators
Supervised Neural Quality Estimators
Passage Quality for Static Pruning
Experimental Setup
Datasets
Ranking Pipelines
Passage Quality Estimators
Measures
Results & Analysis
...and 7 more sections

Figures (5)

Figure 1: Example from msmarco_doc_01_1225433927 showing that not all passages within a document are necessarily valuable.
Figure 2: ROC curves for each passage quality estimator, based on a union of all relevant documents in the full MSMARCO dev set, DL 2019, and DL 2020 and excluding all relevant passages from the train set. The figure details the range [0.8,1.0], thereby focussing on the passages most likely to be pruned. The AUC for each estimator is in the legend.
Figure 3: Precision-oriented retrieval effectiveness on four pipelines by the percentage of a corpus pruned using each quality estimator. Effectiveness measurements that are statistically equivalent to the unpruned passage corpus are marked with $\medbullet$. Note that the vertical axis of each plot is scaled to emphasise the effect on each individual model.
Figure 4: Precision-oriented pruning effectiveness of three supervised QT5 model sizes on four pipelines. Effectiveness measurements that are statistically equivalent to the unpruned passage corpus are marked with $\medbullet$. Note that the vertical axis of each plot is scaled to emphasise the effect on each individual model.
Figure 5: Transferability of QT5-Tiny to two other datasets: MSMARCO v2 (TREC DL 21&22) and CORD19 (TREC COVID). Effectiveness measurements that are statistically equivalent to the unpruned corpus are marked with $\medbullet$. Note that the vertical axis of each plot is scaled to emphasise the effect on each individual model.
