Conversational Query Reformulation with the Guidance of Retrieved Documents
Jeonghyun Park, Hwanhee Lee
TL;DR
GuideCQR tackles conversational query reformulation by exploiting signals in initially retrieved documents to craft retriever-friendly queries. The method retrieves an initial document set using a baseline query $Q_{baseline}$ derived from the dialogue context, re-ranks it to form a guiding subset, and then extracts keywords $K$ and generates expected answers $A$ from these documents. A filtering stage computes a $FilterScore$ from $QueryScore$ and $HistoryScore$ to prune irrelevant signals; the remaining $K_{filtered}$ and $A_{filtered}$ are then unified with $Q_{baseline}$ to produce $Q_{final}$. Experiments on CAsT-19, CAsT-20, and QReCC show state-of-the-art performance, and further analyses indicate that the approach remains effective across different rewriting setups. This work advances practical conversational search by aligning reformulation with retriever objectives and offers a framework for integrating document-derived guidance into query rewriting.
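The filter-and-unify step described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the bag-of-words cosine similarity, the averaging of $QueryScore$ and $HistoryScore$ into $FilterScore$, the use of the maximum over history turns, the `threshold` value, and all function names are hypothetical stand-ins, since this section does not specify how the scores are computed or combined.

```python
import math
import re
from collections import Counter


def bow(text):
    """Toy bag-of-words vector (assumption: stands in for a real embedding)."""
    return Counter(re.findall(r"\w+", text.lower()))


def cosine(a, b):
    """Cosine similarity between the bag-of-words vectors of two strings."""
    va, vb = bow(a), bow(b)
    dot = sum(va[t] * vb[t] for t in va)
    norm = math.sqrt(sum(v * v for v in va.values())) * math.sqrt(
        sum(v * v for v in vb.values())
    )
    return dot / norm if norm else 0.0


def filter_score(candidate, query, history):
    """Hypothetical FilterScore: mean of QueryScore and HistoryScore."""
    query_score = cosine(candidate, query)
    # Assumption: HistoryScore is the best match against any earlier turn.
    history_score = max((cosine(candidate, h) for h in history), default=0.0)
    return (query_score + history_score) / 2


def build_final_query(q_baseline, history, keywords, answers, threshold=0.1):
    """Keep signals whose score clears the threshold, then append them to Q_baseline."""

    def keep(items):
        return [s for s in items if filter_score(s, q_baseline, history) >= threshold]

    k_filtered = keep(keywords)
    a_filtered = keep(answers)
    return " ".join([q_baseline, *k_filtered, *a_filtered])


# Illustrative inputs (made up for this sketch).
history = ["What is throat cancer?"]
q_baseline = "Is throat cancer treatable?"
keywords = ["throat cancer treatment", "stock market"]
answers = ["throat cancer is treatable with radiation"]

q_final = build_final_query(q_baseline, history, keywords, answers)
print(q_final)
```

In this toy run the off-topic keyword "stock market" shares no terms with the query or the history, so it scores zero and is dropped, while the on-topic keyword and answer are appended to the baseline query. A real system would presumably replace `cosine` with similarity over learned embeddings and tune the threshold on held-out data.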
Abstract
Conversational search seeks to retrieve relevant passages for the given questions in conversational question answering. Conversational Query Reformulation (CQR) improves conversational search by refining the original queries into de-contextualized forms, resolving issues such as omissions and coreferences. Previous CQR methods focus on imitating human-written queries, which may not always yield meaningful search results for the retriever. In this paper, we introduce GuideCQR, a framework that refines queries for CQR by leveraging key information from the initially retrieved documents. Specifically, GuideCQR extracts keywords and generates expected answers from the retrieved documents, filters them, and then unifies them with the queries to add useful information that enhances the search process. Experimental results demonstrate that our proposed method achieves state-of-the-art performance across multiple datasets, outperforming previous CQR methods. Additionally, we show that GuideCQR yields further performance gains in conversational search when applied to various types of queries, including queries written by humans.
