Conversational Query Reformulation with the Guidance of Retrieved Documents

Jeonghyun Park, Hwanhee Lee

TL;DR

GuideCQR tackles conversational query reformulation by exploiting signals in initially retrieved documents to craft retriever-friendly queries. The method retrieves an initial document set using a baseline query $Q_{baseline}$ derived from the dialogue context, re-ranks the set to form a guiding subset, and then extracts keywords $K$ and generates expected answers $A$ from these documents. A filtering stage computes a $FilterScore$ from a $QueryScore$ and a $HistoryScore$ to prune these signals, and the surviving $K_{filtered}$ and $A_{filtered}$ are unified with $Q_{baseline}$ to produce $Q_{final}$. Experiments on CAsT-19, CAsT-20, and QReCC show state-of-the-art performance and robustness, and analyses show that the approach leverages these signals effectively while remaining adaptable to different rewriting setups. The work advances practical conversational search by aligning reformulation with retriever objectives, and it offers a scalable framework for integrating document-derived guidance into query rewriting.
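The filter-and-unify step described above can be illustrated with a small sketch. The summary does not say how $QueryScore$ and $HistoryScore$ are computed or combined, so everything concrete here is an assumption made for illustration: the toy bag-of-words `embed` function stands in for a real sentence encoder, the two scores are taken to be cosine similarities, $FilterScore$ is taken to be their plain average, and the `threshold` value and the function names are hypothetical.

```python
import math
from collections import Counter


def embed(text):
    """Toy bag-of-words vector; a stand-in (assumption) for a real encoder."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def filter_score(candidate, query, history):
    """Average of QueryScore and HistoryScore (the averaging is an assumption)."""
    cand = embed(candidate)
    query_score = cosine(cand, embed(query))
    if history:
        history_score = sum(cosine(cand, embed(h)) for h in history) / len(history)
    else:
        history_score = 0.0
    return (query_score + history_score) / 2


def reformulate(q_baseline, history, keywords, answers, threshold=0.1):
    """Keep signals scoring at least `threshold` (hypothetical value), then
    append them to the baseline query to form the final query."""
    k_filtered = [k for k in keywords if filter_score(k, q_baseline, history) >= threshold]
    a_filtered = [a for a in answers if filter_score(a, q_baseline, history) >= threshold]
    return " ".join([q_baseline, *k_filtered, *a_filtered])


q_baseline = "what are the symptoms of throat cancer"
history = ["what is throat cancer", "is throat cancer treatable"]
keywords = ["throat cancer symptoms", "stock market"]
answers = ["persistent cough and throat pain", "interest rates"]

q_final = reformulate(q_baseline, history, keywords, answers)
print(q_final)
```

In this toy run, the candidates that share no vocabulary with the query or the history ("stock market", "interest rates") score zero and are dropped, while the on-topic keyword and answer are appended to the baseline query. A real system would replace `embed` with a learned encoder and tune the threshold on held-out data.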

Abstract

Conversational search seeks to retrieve relevant passages for the given questions in conversational question answering. Conversational Query Reformulation (CQR) improves conversational search by refining the original queries into de-contextualized forms, resolving issues such as omissions and coreferences. Previous CQR methods focus on imitating human-written queries, which may not always yield meaningful search results for the retriever. In this paper, we introduce GuideCQR, a framework that refines queries for CQR by leveraging key information from the initially retrieved documents. Specifically, GuideCQR extracts keywords and generates expected answers from the retrieved documents, filters them, and then unifies them with the queries to add useful information that enhances the search process. Experimental results demonstrate that our proposed method achieves state-of-the-art performance across multiple datasets, outperforming previous CQR methods. Additionally, we show that GuideCQR yields further performance gains in conversational search across various types of queries, including queries written by humans.

Paper Structure (38 sections, 9 equations, 3 figures, 16 tables)


Figures (3)

  • Figure 1: Example queries for ConvQA where the reformulated query achieves a higher MRR score than the base query by effectively extracting clues from the initially retrieved documents.
  • Figure 2: Overall framework of GuideCQR. For easier understanding, we visualize only the top-1 ranked document, and present the keywords augmented from that document along with 3 answer pairs from the top-3 documents.
  • Figure 3: Failure case of GuideCQR when reformulating a conversational query, where the system generates keywords and answers that are irrelevant to $Q_{baseline}$.