Open-Retrieval Conversational Question Answering

Chen Qu, Liu Yang, Cen Chen, Minghui Qiu, W. Bruce Croft, Mohit Iyyer

TL;DR

The paper introduces open-retrieval conversational question answering (ORConvQA) and the OR-QuAC dataset, built by integrating QuAC with CANARD rewrites and a vast Wikipedia passage collection to enable retrieval-before-answer tasks. It presents an end-to-end Transformer-based system with a learnable retriever, a reranker, and a reader, all incorporating history modeling; training proceeds via retriever pretraining and concurrent multi-task learning. Results show a strong need for a learnable retriever, substantial gains from incorporating dialog history across components, and a regularization role for the reranker, with initial questions found to be particularly informative. The work provides new insights into ORConvQA design and establishes a dataset and methodology to advance open-retrieval conversational search.

Abstract

Conversational search is one of the ultimate goals of information retrieval. Recent research approaches conversational search by simplified settings of response ranking and conversational question answering, where an answer is either selected from a given candidate set or extracted from a given passage. These simplifications neglect the fundamental role of retrieval in conversational search. To address this limitation, we introduce an open-retrieval conversational question answering (ORConvQA) setting, where we learn to retrieve evidence from a large collection before extracting answers, as a further step towards building functional conversational search systems. We create a dataset, OR-QuAC, to facilitate research on ORConvQA. We build an end-to-end system for ORConvQA, featuring a retriever, a reranker, and a reader that are all based on Transformers. Our extensive experiments on OR-QuAC demonstrate that a learnable retriever is crucial for ORConvQA. We further show that our system can make a substantial improvement when we enable history modeling in all system components. Moreover, we show that the reranker component contributes to the model performance by providing a regularization effect. Finally, further in-depth analyses are performed to provide new insights into ORConvQA.

Paper Structure

This paper contains 30 sections, 19 equations, 5 figures, 4 tables.

Figures (5)

  • Figure 1: A partial OR-QuAC dialog and example relevant passages retrieved from the collection by TF-IDF.
  • Figure 2: Architecture of our end-to-end ORConvQA model. The input is the current question $q_k$, all history questions $\{q_i\}_{i=1}^{k-1}$, and a history window size $w$. The retriever first retrieves top-$K$ relevant passages from the collection and generates retriever scores $S_{rt}$. The reranker and reader then rerank and read the top passages to produce an answer span for each passage and generate reranker and reader scores, $S_{rr}$ and $S_{rd}$. The system outputs the answer span with the highest overall score $S$.
  • Figure 3: Retriever pretraining.
  • Figure 4: Impact of history window size $w$.
  • Figure 5: Impact of the number of samples $K_{rt}$ used to update the retriever.
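
The data flow described in the Figure 2 caption can be illustrated with a minimal sketch. It is not the authors' implementation: the names `build_query`, `Candidate`, and `best_answer` are hypothetical, the `[SEP]` separator is an assumed way of joining questions, and the caption does not say how $S_{rt}$, $S_{rr}$, and $S_{rd}$ are combined into $S$, so an unweighted sum is assumed here purely for illustration.

```python
from dataclasses import dataclass
from typing import List


def build_query(questions: List[str], w: int, sep: str = " [SEP] ") -> str:
    """Join the current question q_k with at most w preceding history questions.

    `questions` holds q_1..q_k in dialog order; the last item is the current question.
    The separator is an assumption, not taken from the paper.
    """
    if not questions:
        raise ValueError("need at least the current question")
    current = questions[-1]
    history = questions[:-1]
    # Keep only the w most recent history questions (none when w == 0).
    window = history[-w:] if w > 0 else []
    return sep.join(window + [current])


@dataclass
class Candidate:
    span: str    # answer span the reader extracted from one retrieved passage
    s_rt: float  # retriever score S_rt for that passage
    s_rr: float  # reranker score S_rr for that passage
    s_rd: float  # reader score S_rd for the span

    @property
    def overall(self) -> float:
        # Assumption: the overall score S is an unweighted sum of the three scores.
        return self.s_rt + self.s_rr + self.s_rd


def best_answer(candidates: List[Candidate]) -> Candidate:
    """Return the candidate with the highest overall score S."""
    return max(candidates, key=lambda c: c.overall)
```

With a window of $w = 1$, a three-turn dialog yields a query made of the previous question followed by the current one; `best_answer` then picks one span out of the top-$K$ per-passage candidates.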