Conversational Question Reformulation via Sequence-to-Sequence Architectures and Pretrained Language Models
Sheng-Chieh Lin, Jheng-Hong Yang, Rodrigo Nogueira, Ming-Feng Tsai, Chuan-Ju Wang, Jimmy Lin
TL;DR
The paper studies conversational question reformulation (CQR), which generates a context-aware rewrite of the current query from the dialogue history. It uses pretrained language models (PLMs) to mitigate the token-to-token independence assumption of maximum likelihood estimation, fine-tuning BERT, GPT-2, UniLM, and especially T5 within sequence-to-sequence architectures. The main finding is that T5-base achieves state-of-the-art BLEU on CANARD, approaching human performance in-domain, and outperforms the other models on CAsT, showing strong cross-domain transfer when paired with decoding strategies such as beam search. Because T5 does this with fewer parameters than comparable transformer architectures, the work offers a practical way to improve open-domain conversational QA and conversational search through context-aware reformulation.
Abstract
This paper presents an empirical study of conversational question reformulation (CQR) with sequence-to-sequence architectures and pretrained language models (PLMs). We leverage PLMs to address the strong token-to-token independence assumption made in the common objective, maximum likelihood estimation, for the CQR task. In CQR benchmarks of task-oriented dialogue systems, we evaluate fine-tuned PLMs on the recently introduced CANARD dataset as an in-domain task and validate the models using data from the TREC 2019 CAsT Track as an out-of-domain task. Examining a variety of architectures with different numbers of parameters, we demonstrate that the recent text-to-text transfer transformer (T5) achieves the best results on both CANARD and CAsT with fewer parameters than similar transformer architectures.
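As a rough illustration of the task setup, the sketch below shows one way the dialogue history and the current query might be flattened into a single source string for a sequence-to-sequence model. The function name `build_cqr_input`, the separator `" ||| "`, and the example turns are all assumptions made for illustration; the text above does not specify the input format the authors used.

```python
from typing import List

# Hypothetical turn separator; the source does not state which one is used.
SEP = " ||| "


def build_cqr_input(history: List[str], query: str, sep: str = SEP) -> str:
    """Join earlier turns and the current query into one source string.

    Empty or whitespace-only history turns are skipped.
    """
    turns = [turn.strip() for turn in history if turn.strip()]
    turns.append(query.strip())
    return sep.join(turns)


# Illustrative (invented) dialogue, not taken from CANARD or CAsT.
example_history = [
    "What is throat cancer?",
    "Throat cancer refers to tumors that develop in the throat.",
]
example_query = "Is it treatable?"

source_text = build_cqr_input(example_history, example_query)
print(source_text)
```

A fine-tuned sequence-to-sequence model would take `source_text` as input and be trained to emit a rewrite that resolves the pronoun, for example "Is throat cancer treatable?". The model call itself is left out here because the checkpoint and tokenizer details are not given in the text.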
