Generative Query Reformulation Using Ensemble Prompting, Document Fusion, and Relevance Feedback

Kaustubh D. Dhole, Ramraj Chandradevan, Eugene Agichtein

TL;DR

The paper tackles query reformulation (QR) with ensemble zero-shot prompting of large language models. It introduces GenQREnsemble and GenQRFusion for pre-retrieval QR, along with the post-retrieval relevance-feedback variants GenQREnsemble-RF and GenQRFusion-RF, which incorporate feedback from sources such as an oracle simulating a human user or a "critic" LLM. Across four IR benchmarks, the approach achieves relative gains in nDCG@10 of up to 18% in pre-retrieval settings and up to 9% in post-retrieval settings, setting new state-of-the-art results for automated QR. The authors further investigate domain-specific instructions, the influence of feedback documents, filtering of reformulations, and fluent, interpretable reformulations, and they discuss practical implications for retrieval systems and directions for future research.

Abstract

Query Reformulation (QR) is a set of techniques used to transform a user's original search query into text that better aligns with the user's intent and improves their search experience. Recently, zero-shot QR has been a promising approach due to its ability to exploit knowledge inherent in large language models. Inspired by the success of ensemble prompting strategies, which have benefited other tasks, we investigate whether they can improve query reformulation. In this context, we propose two ensemble-based prompting techniques, GenQREnsemble and GenQRFusion, which leverage paraphrases of a zero-shot instruction to generate multiple sets of keywords and ultimately improve retrieval performance. We further introduce their post-retrieval variants to incorporate relevance feedback from a variety of sources, including an oracle simulating a human user and a "critic" LLM. We demonstrate that an ensemble of query reformulations can improve retrieval effectiveness by up to 18% on nDCG@10 in pre-retrieval settings and 9% in post-retrieval settings on multiple benchmarks, outperforming all previously reported SOTA results. We perform subsequent analyses to investigate the effects of feedback documents, incorporate domain-specific instructions, filter reformulations, and generate fluent reformulations that might be more beneficial to human searchers. Together, the techniques and the results presented in this paper establish a new state of the art in automated query reformulation for retrieval and suggest promising directions for future research.
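
As a rough illustration of the two pre-retrieval ideas described in the abstract, the sketch below contrasts merging keywords from several instruction paraphrases into one expanded query (in the spirit of GenQREnsemble) with fusing the ranked lists retrieved for each reformulation (in the spirit of GenQRFusion). Everything in it is an assumption made for illustration: the instruction strings, the `toy_generate` stand-in for an LLM call, the de-duplication step, and the use of reciprocal rank fusion with `k=60` are not taken from the paper, which may combine keywords and rankings differently.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Sequence

# Hypothetical paraphrases of a single zero-shot instruction (not the paper's prompts).
INSTRUCTIONS = [
    "Suggest expansion terms for the query",
    "Recommend keywords that improve retrieval for the query",
    "List related search terms for the query",
]


def toy_generate(instruction: str, query: str) -> List[str]:
    """Stand-in for an LLM call: returns canned keywords per instruction."""
    canned = {
        INSTRUCTIONS[0]: ["big cat", "velocity"],
        INSTRUCTIONS[1]: ["velocity", "mph"],
        INSTRUCTIONS[2]: ["sprint", "Big Cat"],
    }
    return canned.get(instruction, [])


def ensemble_reformulate(
    query: str,
    instructions: Sequence[str],
    generate: Callable[[str, str], List[str]],
) -> str:
    """Append keywords from every instruction to the original query.

    Case-insensitive de-duplication is an assumption of this sketch.
    """
    seen = set()
    keywords: List[str] = []
    for instruction in instructions:
        for keyword in generate(instruction, query):
            key = keyword.lower()
            if key not in seen:
                seen.add(key)
                keywords.append(keyword)
    return " ".join([query, *keywords])


def fuse_rankings(rankings: Sequence[Sequence[str]], k: int = 60) -> List[str]:
    """Reciprocal rank fusion over ranked lists of document ids.

    Each list would come from retrieving with one reformulated query.
    Ties are broken by document id so the output is deterministic.
    """
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


if __name__ == "__main__":
    print(ensemble_reformulate("jaguar top speed", INSTRUCTIONS, toy_generate))
    print(fuse_rankings([["d1", "d2", "d3"], ["d2", "d1"], ["d2", "d4"]]))
```

In a real pipeline, `toy_generate` would be replaced by a call to a language model and the input rankings by the output of a retriever. The fusion route issues one retrieval per reformulation, whereas the ensemble route issues a single retrieval with a longer query.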

Paper Structure (24 sections, 13 figures, 6 tables)

Figures (13)

  • Figure 1: The complete flow and algorithm of GenQREnsemble and GenQREnsemble-RF (dotted).
  • Figure 2: The complete flow of GenQRFusion. GenQRFusion-RF is shown with the dotted line.
  • Figure 3: The $N$ reformulation instructions used for GenQREnsemble and GenQRFusion.
  • Figure 4: The prompt used for all the Llama-2 Query reformulators.
  • Figure 5: Query Reformulations generated from the Single Instruction Setting and the Ensemble Setting. The grey highlights depict the terms present in the highest relevant (gold) documents.
  • ...and 8 more figures