Few-Shot Generative Conversational Query Rewriting

Shi Yu, Jiahua Liu, Jingqin Yang, Chenyan Xiong, Paul Bennett, Jianfeng Gao, Zhiyuan Liu

TL;DR

This paper tackles conversational query rewriting: training a GPT-2 rewriter to turn conversational turns into de-contextualized queries with very little manual labeling. It introduces two weak-supervision strategies (rule-based simulation of omission and coreference, and a self-supervised query simplifier) that generate training data from abundant ad hoc search sessions. In the few-shot setting, the approach improves on the previous state-of-the-art ranking accuracy on the TREC CAsT 2019 benchmark by 12%, and in the zero-shot setting it remains comparable to previous state-of-the-art systems. The paper also analyzes how well the model picks up the task syntax and handles hard cases such as group references and long-turn dependencies.

Abstract

Conversational query rewriting aims to reformulate a concise conversational query to a fully specified, context-independent query that can be effectively handled by existing information retrieval systems. This paper presents a few-shot generative approach to conversational query rewriting. We develop two methods, based on rules and self-supervised learning, to generate weak supervision data using large amounts of ad hoc search sessions, and to fine-tune GPT-2 to rewrite conversational queries. On the TREC Conversational Assistance Track, our weakly supervised GPT-2 rewriter improves the state-of-the-art ranking accuracy by 12%, only using very limited amounts of manual query rewrites. In the zero-shot learning setting, the rewriter still gives a comparable result to previous state-of-the-art systems. Our analyses reveal that GPT-2 effectively picks up the task syntax and learns to capture context dependencies, even for hard cases that involve group references and long-turn dependencies.
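The rule-based route to weak supervision can be pictured with a toy sketch. The code below is not the authors' rule set; it assumes a hypothetical heuristic in which the longest token span shared by two consecutive ad hoc queries is either replaced by a pronoun (coreference) or dropped together with a preceding preposition (omission). The function names, the preposition list, and the default pronoun are all illustrative assumptions.

```python
from typing import List, Optional, Tuple

# Hypothetical list; the paper's actual rules may differ.
PREPOSITIONS = {"of", "for", "in", "on", "about"}

def _find(tokens: List[str], span: List[str]) -> int:
    """Return the start index of `span` inside `tokens`, or -1 if absent."""
    n = len(span)
    for i in range(len(tokens) - n + 1):
        if tokens[i:i + n] == span:
            return i
    return -1

def shared_span(prev: List[str], cur: List[str]) -> Optional[Tuple[int, int]]:
    """Longest contiguous token span of `cur` that also occurs in `prev`."""
    best: Optional[Tuple[int, int]] = None
    for i in range(len(cur)):
        for j in range(i + 1, len(cur) + 1):
            longer = best is None or j - i > best[1] - best[0]
            if longer and _find(prev, cur[i:j]) >= 0:
                best = (i, j)
    return best

def simulate_coreference(prev_q: str, cur_q: str, pronoun: str = "it") -> Optional[str]:
    """Replace the shared span in the current query with a pronoun."""
    prev, cur = prev_q.lower().split(), cur_q.lower().split()
    span = shared_span(prev, cur)
    if span is None:
        return None  # nothing to simplify: no overlap with the previous query
    i, j = span
    return " ".join(cur[:i] + [pronoun] + cur[j:])

def simulate_omission(prev_q: str, cur_q: str) -> Optional[str]:
    """Drop the shared span, plus a preposition directly before it."""
    prev, cur = prev_q.lower().split(), cur_q.lower().split()
    span = shared_span(prev, cur)
    if span is None:
        return None
    i, j = span
    if i > 0 and cur[i - 1] in PREPOSITIONS:
        i -= 1
    rest = cur[:i] + cur[j:]
    return " ".join(rest) if rest else None
```

Under these assumptions, a simplified query paired with the original one forms a weak training example: the previous query plus the simplified query serve as rewriter input, and the original, fully specified query is the target.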

Paper Structure

This paper contains 9 sections, 4 equations, 2 figures, 3 tables.

Figures (2)

  • Figure 1: Performances in Different Scenarios. X-axis in (b) shows turn depths and Y-axis is NDCG@3.
  • Figure 2: Performances of GPT-2 with different fine-tuning amounts: conversational sessions with manual rewrites (a) and fine-tuning steps (b). The Y-axes show the corresponding metric in (a) and (b).
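Figure 1 reports NDCG@3. As a reminder of what that metric measures, here is a minimal sketch of NDCG at a cutoff k. It assumes linear gain (the graded relevance label itself) and a log2(rank + 1) discount; some evaluations use an exponential gain instead, so the exact numbers depend on the evaluation tool. The function names are illustrative.

```python
import math
from typing import Sequence

def dcg_at_k(rels: Sequence[float], k: int) -> float:
    """Discounted cumulative gain over the top-k relevance labels."""
    # enumerate() is 0-based, so rank 1 gets a discount of log2(2) = 1.
    return sum(r / math.log2(pos + 2) for pos, r in enumerate(rels[:k]))

def ndcg_at_k(ranked_rels: Sequence[float], all_rels: Sequence[float], k: int = 3) -> float:
    """NDCG@k: DCG of the system ranking divided by DCG of the ideal ranking.

    ranked_rels: labels of the retrieved documents, in ranked order.
    all_rels:    labels of all judged documents for the query.
    """
    ideal = dcg_at_k(sorted(all_rels, reverse=True), k)
    return dcg_at_k(ranked_rels, k) / ideal if ideal > 0 else 0.0
```

A perfect ranking scores 1.0, and a query with no relevant judged documents is scored 0.0 here by convention.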