Multi-task Learning with Sample Re-weighting for Machine Reading Comprehension
Yichong Xu, Xiaodong Liu, Yelong Shen, Jingjing Liu, Jianfeng Gao
TL;DR
The paper aims to build a single machine reading comprehension (MRC) model that generalizes across domains, using multi-task learning with explicit selection of auxiliary data. It introduces MT-SAN, an architecture based on the Stochastic Answer Network (SAN) and extended with highway layers and an exponential moving average (EMA) of the parameters. It also introduces two training strategies: a simple mixture ratio that controls how much auxiliary data is used, and a novel sample re-weighting scheme that assigns a per-sample importance weight to auxiliary data. Results on SQuAD, NewsQA, MS MARCO, and Who-Did-What (WDW) show consistent improvements over single-task baselines, with the largest gains on NewsQA, where MT-SAN surpasses human performance on the dev set. The work shows that sample-level re-weighting combined with multi-task training yields robust MRC performance across diverse datasets, and it suggests integrating larger language models as a direction for further gains.
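As a rough illustration of the mixture-ratio idea, the sketch below builds one epoch's batch schedule from every target batch plus a sampled share of the auxiliary batches. The function name `mix_batches`, the `ratio` parameter, and the rule that the auxiliary share is `ratio` times the number of target batches are assumptions made for this example; the summary does not specify the exact procedure.

```python
import random


def mix_batches(target_batches, auxiliary_batches, ratio, seed=0):
    """Return a shuffled schedule of (source, batch) pairs for one epoch.

    Hypothetical sketch: every target batch is kept, and the number of
    auxiliary batches is int(ratio * number of target batches), capped at
    the number of auxiliary batches available.
    """
    rng = random.Random(seed)
    n_aux = min(len(auxiliary_batches), int(ratio * len(target_batches)))
    sampled = rng.sample(auxiliary_batches, n_aux)  # sample without replacement
    schedule = [("target", b) for b in target_batches]
    schedule += [("auxiliary", b) for b in sampled]
    rng.shuffle(schedule)  # interleave target and auxiliary batches
    return schedule


# Usage: 10 target batches and a ratio of 0.5 give 5 auxiliary batches.
schedule = mix_batches(list(range(10)), list(range(100, 150)), ratio=0.5)
```

Under this assumption, a ratio of 0 reduces to single-task training on the target dataset, and larger ratios give the auxiliary data more influence.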
Abstract
We propose a multi-task learning framework to learn a joint Machine Reading Comprehension (MRC) model that can be applied to a wide range of MRC tasks in different domains. Inspired by recent ideas of data selection in machine translation, we develop a novel sample re-weighting scheme to assign sample-specific weights to the loss. Empirical study shows that our approach can be applied to many existing MRC models. Combined with contextual representations from pre-trained language models (such as ELMo), we achieve new state-of-the-art results on a set of MRC benchmark datasets. We release our code at https://github.com/xycforgithub/MultiTask-MRC.
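Assigning sample-specific weights to the loss can be pictured with the minimal sketch below. It assumes the per-sample losses and their weights are already available as plain lists of floats. The helper name `weighted_loss` and the choice to average over the batch size are illustrative assumptions, and how the weights themselves are computed is not shown here.

```python
def weighted_loss(per_sample_losses, weights):
    """Average of per-sample losses, each scaled by its own weight.

    Hypothetical helper: a weight of 1.0 leaves a sample's loss unchanged,
    while a smaller weight reduces that sample's contribution.
    """
    if len(per_sample_losses) != len(weights):
        raise ValueError("need exactly one weight per sample")
    if not per_sample_losses:
        raise ValueError("batch must contain at least one sample")
    total = sum(w * l for w, l in zip(weights, per_sample_losses))
    return total / len(per_sample_losses)


# Usage: the second (auxiliary) sample is down-weighted to 0.5.
loss = weighted_loss([1.0, 2.0], [1.0, 0.5])
```

With all weights set to 1.0, this reduces to the ordinary mean loss over the batch.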
