R$^3$: Reinforced Reader-Ranker for Open-Domain Question Answering

Shuohang Wang, Mo Yu, Xiaoxiao Guo, Zhiguo Wang, Tim Klinger, Wei Zhang, Shiyu Chang, Gerald Tesauro, Bowen Zhou, Jing Jiang

TL;DR

The paper tackles open-domain QA by integrating information retrieval with a neural Ranker-Reader. It introduces $R^3$, which jointly trains a passage Ranker (via reinforcement learning) and a Reader to extract answers from the top passages. The approach yields state-of-the-art results on multiple open-domain QA datasets, outperforming baselines and non-RL counterparts. This work demonstrates the value of end-to-end optimization of passage selection and answer extraction in open-domain settings.

Abstract

In recent years, researchers have achieved considerable success applying neural network methods to question answering (QA). These approaches have achieved state-of-the-art results in simplified closed-domain settings such as the SQuAD (Rajpurkar et al., 2016) dataset, which provides a pre-selected passage from which the answer to a given question may be extracted. More recently, researchers have begun to tackle open-domain QA, in which the model is given a question and access to a large corpus (e.g., Wikipedia) instead of a pre-selected passage (Chen et al., 2017a). This setting is more complex, as it requires large-scale search for relevant passages by an information retrieval component, combined with a reading comprehension model that "reads" the passages to generate an answer to the question. Performance in this setting lags considerably behind closed-domain performance. In this paper, we present a novel open-domain QA system called Reinforced Ranker-Reader ($R^3$), based on two algorithmic innovations. First, we propose a new pipeline for open-domain QA with a Ranker component, which learns to rank retrieved passages in terms of likelihood of generating the ground-truth answer to a given question. Second, we propose a novel method that jointly trains the Ranker along with an answer-generation Reader model, based on reinforcement learning. We report extensive experimental results showing that our method significantly improves on the state of the art for multiple open-domain QA datasets.

Paper Structure

This paper contains 16 sections, 14 equations, 1 figure, 7 tables, 1 algorithm.

Figures (1)

  • Figure 1: Overview of training our model, comprising a Ranker and a Reader based on Match-LSTM, as shown on the right side. The Ranker selects a passage $\tau$ and the Reader predicts the start and end positions of the answer in $\tau$. The reward for the Ranker depends on the similarity of the extracted answer to the ground-truth answer $\mathbf{a}^g$. To accelerate Reader convergence, we also sample several negative passages that do not contain the ground-truth answer.
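
The training loop in Figure 1 can be illustrated with a small sketch. It is not the authors' implementation: it assumes a softmax policy over Ranker scores, uses token-level F1 as one plausible answer-similarity reward, and applies a plain REINFORCE gradient with respect to the scores. The names `softmax`, `token_f1`, and `reinforce_step` are hypothetical, and the list `extracted_answers` stands in for the Reader's output on each passage.

```python
import math
import random
from collections import Counter


def softmax(scores):
    """Turn raw Ranker scores into a probability distribution over passages."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def token_f1(predicted, gold):
    """Token-level F1 between two answer strings (an assumed similarity measure)."""
    pred_tokens = predicted.lower().split()
    gold_tokens = gold.lower().split()
    overlap = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)


def reinforce_step(scores, extracted_answers, gold_answer, rng):
    """Sample one passage tau, score the answer read from it, and return
    the REINFORCE gradient of the expected reward w.r.t. the Ranker scores.

    extracted_answers[i] is a stand-in for the span the Reader would
    extract from passage i (an assumption made for this illustration).
    """
    probs = softmax(scores)
    tau = rng.choices(range(len(scores)), weights=probs, k=1)[0]
    reward = token_f1(extracted_answers[tau], gold_answer)
    # d log pi(tau) / d score_i = 1[i == tau] - probs[i]
    grad = [reward * ((1.0 if i == tau else 0.0) - p) for i, p in enumerate(probs)]
    return tau, reward, grad


if __name__ == "__main__":
    rng = random.Random(0)
    scores = [0.2, 1.5, -0.3]
    answers = ["in 1969", "Barack Obama", "the president Obama"]
    print(reinforce_step(scores, answers, "Barack Obama", rng))
```

In this sketch, a passage whose extracted answer overlaps the ground truth receives a positive reward, so a gradient-ascent update raises its score relative to the others. A passage that yields no overlap contributes a zero gradient.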