A Deep Learning Approach for Selective Relevance Feedback

Suchana Datta, Debasis Ganguly, Sean MacAvaney, Derek Greene

TL;DR

This paper tackles the problem of when to apply pseudo-relevance feedback (PRF), which can improve retrieval but often causes query drift. It proposes a fully data-driven, supervised selective PRF framework using a transformer-based encoding (Deep-SRF-BERT) to learn a PRF decision function, and integrates model confidence to softly fuse rankings from original and expanded queries. The method is shown to yield consistent gains across sparse and dense retrieval models, and its decision function generalizes across different PRF approaches, approaching oracle-like performance. The approach reduces unnecessary PRF usage and provides a practical, model-agnostic solution for selective relevance feedback in modern IR pipelines.

Abstract

Pseudo-relevance feedback (PRF) can enhance average retrieval effectiveness over a sufficiently large number of queries. However, PRF often introduces a drift into the original information need, thus hurting the retrieval effectiveness of several queries. While a selective application of PRF can potentially alleviate this issue, previous approaches have largely relied on unsupervised or feature-based learning to determine whether a query should be expanded. In contrast, we revisit the problem of selective PRF from a deep learning perspective, presenting a model that is entirely data-driven and trained in an end-to-end manner. The proposed model leverages a transformer-based bi-encoder architecture. Additionally, to further improve retrieval effectiveness with this selective PRF approach, we make use of the model's confidence estimates to combine the information from the original and expanded queries. In our experiments, we apply this selective feedback on a number of different combinations of ranking and feedback models, and show that our proposed approach consistently improves retrieval effectiveness for both sparse and dense ranking models, with the feedback models being either sparse, dense or generative.
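
The abstract describes two uses of the model's output: a hard decision on whether to expand a query, and a confidence-weighted combination of the original and expanded queries. The sketch below illustrates both under stated assumptions. This section does not give the fusion formula, so the linear interpolation of min-max normalized scores shown here is an illustrative stand-in, not the authors' method. The names `selective_prf`, `soft_fuse`, `p_feedback` and the 0.5 threshold are hypothetical.

```python
from typing import Dict, List


def min_max(scores: Dict[str, float]) -> Dict[str, float]:
    """Rescale scores to [0, 1] so that two rankers become comparable."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc: 1.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def selective_prf(orig: Dict[str, float], prf: Dict[str, float],
                  p_feedback: float, threshold: float = 0.5) -> Dict[str, float]:
    """Hard selection: keep the PRF run only if the classifier favours it.

    p_feedback is an assumed classifier output in [0, 1]; the threshold
    is a hypothetical default, not a value taken from the paper.
    """
    return prf if p_feedback >= threshold else orig


def soft_fuse(orig: Dict[str, float], prf: Dict[str, float],
              p_feedback: float) -> List[str]:
    """Soft selection: interpolate normalized scores by model confidence.

    Assumed form: (1 - p) * score_orig + p * score_prf. Documents missing
    from one run contribute 0 from that run.
    """
    a, b = min_max(orig), min_max(prf)
    fused = {
        doc: (1.0 - p_feedback) * a.get(doc, 0.0) + p_feedback * b.get(doc, 0.0)
        for doc in set(a) | set(b)
    }
    # Sort by fused score (descending), breaking ties by document id.
    return sorted(fused, key=lambda doc: (-fused[doc], doc))
```

With `p_feedback` at 0 the fused ranking follows the original query's ordering for the documents it scored, and at 1 it follows the expanded query's ordering; values in between blend the two rather than committing to either.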

Paper Structure (27 sections, 10 equations, 3 figures, 3 tables)

Figures (3)

  • Figure 1: Relative changes in AP, i.e., (AP(post-fdbk) - AP(pre-fdbk))/AP(pre-fdbk), for TREC DL'20 queries. We observe that many queries are negatively impacted by PRF (bars below the x-axis).
  • Figure 2: A schematic diagram of selective feedback. The main contribution of this paper is a supervised data-driven approach towards realising the decision function.
  • Figure 3: Training of a transformer-based query-document architecture with shared parameters for selective PRF. During inference, only the left part of the network is used to decide whether to apply PRF for a given query.
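
The quantity plotted in Figure 1, (AP(post-fdbk) - AP(pre-fdbk))/AP(pre-fdbk), can be computed per query as in the minimal sketch below. The helper names are hypothetical, and the AP function is the standard uninterpolated definition over a full ranked list; any rank cutoff or relevance-grade threshold used in the paper's evaluation is not stated in this section and is not modelled here.

```python
from typing import List, Optional, Set


def average_precision(ranked: List[str], relevant: Set[str]) -> float:
    """Uninterpolated average precision of one ranked list."""
    if not relevant:
        return 0.0
    hits = 0
    total = 0.0
    for rank, doc in enumerate(ranked, start=1):
        if doc in relevant:
            hits += 1
            total += hits / rank  # precision at this relevant document
    return total / len(relevant)


def relative_ap_change(ap_pre: float, ap_post: float) -> Optional[float]:
    """(AP_post - AP_pre) / AP_pre, or None when AP_pre is zero."""
    if ap_pre == 0:
        return None  # the ratio is undefined for this query
    return (ap_post - ap_pre) / ap_pre


# Toy example: feedback pushes a non-relevant document to the top.
relevant = {"a", "b"}
ap_pre = average_precision(["a", "x", "b"], relevant)
ap_post = average_precision(["x", "a", "b"], relevant)
change = relative_ap_change(ap_pre, ap_post)  # negative: PRF hurt this query
```

A negative value corresponds to a bar below the x-axis in Figure 1. One plausible use, which this section does not confirm, is to treat the sign of this change as a binary training label for the decision function.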