A Surprisingly Simple yet Effective Multi-Query Rewriting Method for Conversational Passage Retrieval
Ivica Kostric, Krisztian Balog
TL;DR
The paper tackles conversational passage retrieval, where a single neural query rewrite may fail to capture the user’s intent as expressed across the conversation history. It introduces Conversational Multi-Query Rewriting (CMQR), which takes the top $n$ rewrites per turn from the beam search that the rewriter already runs, so generating them adds no computational cost, and integrates them into both sparse and dense first-pass retrieval. In sparse retrieval, CMQR weights each term by the scores of the rewrites it appears in and aggregates these weights across rewrites, which yields both query expansion and term importance estimation; in dense retrieval, it represents the query as the centroid of the rewrite embeddings, weighted by the rewrite scores. Evaluated on QReCC, CMQR yields state-of-the-art results across pipelines, with consistent improvements in MRR and performance competitive with manual rewrites, demonstrating the practical value of a simple multi-rewrite strategy in conversational IR.
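The two aggregation steps described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the softmax normalization of rewrite scores, the whitespace tokenization, and the toy rewrites and embeddings are all assumptions made for illustration. The summary states only that rewrite scores weight the terms (sparse) and the embeddings (dense).

```python
import math
from collections import defaultdict


def normalize_scores(log_probs):
    """Softmax over rewrite log-probabilities so the weights sum to 1.

    Assumption: the source only says rewrite scores are used as weights;
    softmax normalization is one illustrative choice.
    """
    top = max(log_probs)
    exps = [math.exp(lp - top) for lp in log_probs]
    total = sum(exps)
    return [e / total for e in exps]


def sparse_term_weights(rewrites, weights):
    """Give each term the summed weight of the rewrites it occurs in,
    so terms shared by many likely rewrites dominate the sparse query."""
    term_weights = defaultdict(float)
    for text, weight in zip(rewrites, weights):
        for term in set(text.lower().split()):
            term_weights[term] += weight
    return dict(term_weights)


def dense_centroid(embeddings, weights):
    """Weighted centroid of rewrite embeddings (plain lists of floats)."""
    total = sum(weights)
    dim = len(embeddings[0])
    return [
        sum(w * emb[i] for emb, w in zip(embeddings, weights)) / total
        for i in range(dim)
    ]


# Hypothetical rewrites and log-probabilities for one conversational turn.
rewrites = [
    "who wrote the hobbit",
    "who wrote the hobbit book",
    "who is the author of it",
]
log_probs = [-0.2, -1.0, -2.5]

weights = normalize_scores(log_probs)
term_weights = sparse_term_weights(rewrites, weights)

# Toy 2-dimensional embeddings stand in for a real dense encoder.
toy_embeddings = [[1.0, 0.0], [0.8, 0.2], [0.0, 1.0]]
query_vector = dense_centroid(toy_embeddings, weights)
```

In this sketch, a term such as "hobbit" that appears in the two highest-scoring rewrites receives more weight than "author", which appears only in the lowest-scoring one. The dense query is a single vector, so the retrieval step itself still runs once per turn.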
Abstract
Conversational passage retrieval is challenging as it often requires the resolution of references to previous utterances and needs to deal with the complexities of natural language, such as coreference and ellipsis. To address these challenges, pre-trained sequence-to-sequence neural query rewriters are commonly used to generate a single de-contextualized query based on conversation history. Previous research shows that combining multiple query rewrites for the same user utterance has a positive effect on retrieval performance. We propose the use of a neural query rewriter to generate multiple queries and show how to integrate those queries in the passage retrieval pipeline efficiently. The main strength of our approach lies in its simplicity: it leverages how the beam search algorithm works and can produce multiple query rewrites at no additional cost. Our contributions further include devising ways to utilize multi-query rewrites in both sparse and dense first-pass retrieval. We demonstrate that applying our approach on top of a standard passage retrieval pipeline delivers state-of-the-art performance without sacrificing efficiency.
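To see why beam search yields multiple rewrites at no additional cost, consider the toy sketch below. Decoding with beam width $n$ keeps $n$ scored hypotheses at every step, so returning all of them rather than only the best one requires no extra model calls. The `beam_search` function and the hand-written `toy_model` table are hypothetical stand-ins for a real sequence-to-sequence rewriter, used here only to show the mechanism.

```python
import math

EOS = "</s>"


def toy_model(prefix):
    """Hypothetical next-token distribution, returned as log-probabilities.

    Any prefix not listed in the table can only be followed by EOS.
    """
    table = {
        (): {"who": 0.6, "which": 0.4},
        ("who",): {"wrote": 0.7, "authored": 0.3},
        ("which",): {"author": 1.0},
    }
    probs = table.get(tuple(prefix), {EOS: 1.0})
    return {token: math.log(p) for token, p in probs.items()}


def beam_search(next_token_log_probs, beam_width, max_len):
    """Return up to `beam_width` (tokens, log_prob) hypotheses, best first."""
    beams = [([], 0.0)]
    for _ in range(max_len):
        candidates = []
        for tokens, score in beams:
            if tokens and tokens[-1] == EOS:
                # Finished hypotheses are carried over unchanged.
                candidates.append((tokens, score))
                continue
            for token, log_prob in next_token_log_probs(tokens).items():
                candidates.append((tokens + [token], score + log_prob))
        # Keep only the `beam_width` highest-scoring hypotheses.
        beams = sorted(candidates, key=lambda c: c[1], reverse=True)[:beam_width]
        if all(tokens and tokens[-1] == EOS for tokens, _ in beams):
            break
    return beams


# The final beam already holds several scored rewrites, not just the top one.
hypotheses = beam_search(toy_model, beam_width=3, max_len=4)
rewrites_with_scores = [
    (" ".join(t for t in tokens if t != EOS), score)
    for tokens, score in hypotheses
]
```

The log-probability attached to each hypothesis is the kind of rewrite score that can then weight terms or embeddings downstream.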
