Words or Characters? Fine-grained Gating for Reading Comprehension

Zhilin Yang; Bhuwan Dhingra; Ye Yuan; Junjie Hu; William W. Cohen; Ruslan Salakhutdinov

TL;DR

This paper addresses how to combine word-level and character-level token representations for reading comprehension. It introduces a fine-grained gating mechanism conditioned on token properties, and extends the idea to modeling document-query interactions in a gated attention framework. The approach achieves state-of-the-art results on the Children's Book Test and strong performance on SQuAD and Who Did What, and it also improves results on Twitter tag prediction. Visualizations show intuitive gating behavior: rare or morphologically rich tokens draw more on character-level information, while frequent tokens rely more on word-level representations, which suggests the learned gates are interpretable.

Abstract

Previous work combines word-level and character-level representations using concatenation or scalar weighting, which is suboptimal for high-level tasks like reading comprehension. We present a fine-grained gating mechanism to dynamically combine word-level and character-level representations based on properties of the words. We also extend the idea of fine-grained gating to modeling the interaction between questions and paragraphs for reading comprehension. Experiments show that our approach can improve the performance on reading comprehension tasks, achieving new state-of-the-art results on the Children's Book Test dataset. To demonstrate the generality of our gating mechanism, we also show improved results on a social media tag prediction task.
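The gating idea described above can be illustrated with a minimal sketch. It assumes the gate is a per-dimension sigmoid of a linear map over a token-property feature vector, and that the output is a convex mix of the character-level and word-level vectors. The function name `fine_grained_gate`, the `features` vector, and the parameters `W` and `b` are hypothetical names chosen for illustration, not the authors' code.

```python
import math


def sigmoid(x):
    """Logistic function mapping a real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def fine_grained_gate(word_vec, char_vec, features, W, b):
    """Mix word- and character-level vectors with a per-dimension gate.

    Assumed form (illustrative, not taken from the text above):
        gate  = sigmoid(W @ features + b)
        mixed = gate * char_vec + (1 - gate) * word_vec

    word_vec, char_vec: lists of floats with the same length d
    features: list of floats describing token properties (hypothetical)
    W: d rows, each with len(features) weights; b: d biases
    """
    if len(word_vec) != len(char_vec):
        raise ValueError("word_vec and char_vec must have the same length")
    if len(W) != len(word_vec) or len(b) != len(word_vec):
        raise ValueError("W and b must have one row/entry per dimension")

    # One gate value per embedding dimension, rather than a single scalar.
    gate = [
        sigmoid(sum(w_ij * f_j for w_ij, f_j in zip(row, features)) + b_i)
        for row, b_i in zip(W, b)
    ]
    # A gate near 1 favors the character-level value, near 0 the word-level one.
    mixed = [g * c + (1.0 - g) * w for g, c, w in zip(gate, char_vec, word_vec)]
    return mixed, gate


if __name__ == "__main__":
    word_vec = [1.0, 2.0]
    char_vec = [3.0, 6.0]
    features = [1.0, 0.0, 0.0]  # e.g. a one-hot token property (assumed)
    W = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
    b = [0.0, 0.0]
    print(fine_grained_gate(word_vec, char_vec, features, W, b))
```

Because the gate is a vector rather than a scalar, each dimension of the token representation can lean toward the word-level or the character-level value independently, which is the contrast with concatenation or scalar weighting drawn in the abstract.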
