Towards End-to-End Open Conversational Machine Reading

Sizhe Zhou; Siru Ouyang; Zhuosheng Zhang; Hai Zhao

TL;DR

This work models OR-CMR as a unified text-to-text task in a fully end-to-end style. The proposed end-to-end framework improves on both sub-tasks by a large margin, achieving new state-of-the-art results.

Abstract

In the open-retrieval conversational machine reading (OR-CMR) task, machines are required to perform multi-turn question answering given a dialogue history and a textual knowledge base. Existing works generally use two independent modules to handle the problem's two successive sub-tasks: first a hard-label decision-making step, then a question generation step aided by various entailment reasoning methods. This cascaded modeling is vulnerable to error propagation and prevents the two sub-tasks from being optimized consistently. In this work, we instead model OR-CMR as a unified text-to-text task in a fully end-to-end style. Experiments on the ShARC and OR-ShARC datasets show that our proposed end-to-end framework is effective on both sub-tasks by a large margin, achieving new state-of-the-art results. Further ablation studies support that our framework can generalize to different backbone models.
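To make the idea of a unified text-to-text formulation concrete, the following is a minimal sketch, not the authors' implementation. It assumes that the special tokens named in the Figure 2 caption mark the user question (`[QU]`), scenario (`[SC]`), follow-up question and answer (`[FUQ]`, `[FUA]`), and retrieved rule text (`[SN]`); these meanings, the ordering of the fields, the decision label strings, and the names `Example`, `build_source`, and `parse_output` are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

# Assumed hard-label decisions. The exact label strings are hypothetical.
DECISIONS = ("Yes", "No", "Irrelevant")


@dataclass
class Example:
    """One conversational machine reading instance (hypothetical container)."""
    question: str
    scenario: str
    # Dialogue history as (follow-up question, follow-up answer) pairs.
    history: List[Tuple[str, str]] = field(default_factory=list)
    # Rule texts returned by a retriever.
    rule_texts: List[str] = field(default_factory=list)


def build_source(ex: Example) -> str:
    """Flatten an example into a single input string for an encoder-decoder.

    The token meanings and their order are assumptions, not the paper's format.
    """
    parts = [f"[QU] {ex.question}", f"[SC] {ex.scenario}"]
    for fuq, fua in ex.history:
        parts.append(f"[FUQ] {fuq} [FUA] {fua}")
    for rule in ex.rule_texts:
        parts.append(f"[SN] {rule}")
    return " ".join(parts)


def parse_output(generated: str) -> Tuple[str, str]:
    """Interpret one generated string as either a decision or a follow-up question.

    A single decoder output covers both sub-tasks: if the text is one of the
    decision labels it is a final answer, otherwise it is treated as a
    follow-up question to ask the user.
    """
    text = generated.strip()
    if text in DECISIONS:
        return ("decision", text)
    return ("follow_up", text)
```

In this sketch the same generated sequence serves both sub-tasks, so there is no separate classifier whose errors could propagate into a downstream question generator; how the paper actually encodes its inputs and outputs is described in its Input Formulation and Output Formulation sections.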

Paper Structure (35 sections, 1 equation, 7 figures, 10 tables)

Introduction
Related Work
Conversational Machine Reading
Open-Retrieval CMR
Joint Optimization of CMR
Problem Formulation
Framework
Retriever
Text-to-Text Encoder-Decoder
Input Formulation
Discourse Segmentation
Output Formulation
Training Objective
Experiments
Experiment Setups
...and 20 more sections

Figures (7)

Figure 1: CMR and OR-CMR Task Overview
Figure 2: The overall framework of our proposed model (bottom right) compared with existing ones (bottom left). The way the problem-setting input is preprocessed varies from model to model, though the approaches are generally similar; the setting part shows only an overview of our preprocessing. [QU], [SEP], [SC], [SEP], [FUQ], [FUA], [SN], and [EDU] are added special tokens, while [EOS] is the end-of-sequence token of the encoder-decoder model.
Figure 3: Evaluation performance of our model on the test set with different numbers of retrieved rule texts.
Figure 4: Evaluation performance of our model on the test set with different maximum generation lengths.
Figure 5: Classwise accuracy on the dev set at each epoch.
...and 2 more figures
