Dynamic Integration of Background Knowledge in Neural NLU Systems
Dirk Weissenborn, Tomáš Kočiský, Chris Dyer
TL;DR
This work addresses a limitation of neural NLU systems: their background knowledge is fixed once training ends. It introduces a reading-based architecture that takes in external knowledge expressed as natural-language text (drawn from ConceptNet and Wikipedia) and refines word embeddings over multiple reading steps. Static embeddings are replaced by context-dependent ones, $\mathbf{E}^\ell$ after reading step $\ell$, which are produced by an incremental reading process and trained end-to-end with task-specific models for document question answering (DQA) and recognizing textual entailment (RTE). The reported results show improvements on both tasks, including a new state of the art on TriviaQA, and qualitative analyses indicate that the model uses external knowledge in semantically meaningful ways and can draw counterfactual inferences. The mechanism is general and task-agnostic, so it can augment neural NLU systems with flexible background knowledge that can be kept up to date.
Abstract
Common-sense and background knowledge is required to understand natural language, but in most neural natural language understanding (NLU) systems, this knowledge must be acquired from training corpora during learning, and then it is static at test time. We introduce a new architecture for the dynamic integration of explicit background knowledge in NLU models. A general-purpose reading module reads background knowledge in the form of free-text statements (together with task-specific text inputs) and yields refined word representations to a task-specific NLU architecture that reprocesses the task inputs with these representations. Experiments on document question answering (DQA) and recognizing textual entailment (RTE) demonstrate the effectiveness and flexibility of the approach. Analysis shows that our model learns to exploit knowledge in a semantically appropriate way.
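To make the idea of incremental refinement concrete, the following is a minimal sketch in plain Python, not the authors' implementation. Everything in it is an illustrative assumption: the function names (`encode_context`, `reading_step`, `read_incrementally`), the windowed-average stand-in for the contextual encoder, the element-wise max-pooling over occurrences of a word, and the fixed scalar gate (a trained model would learn its gate and encoder). The order in which texts are read (task input first, then a background statement) is likewise an assumption of the example.

```python
import math
import random

def encode_context(E, tokens, window=1):
    """Stand-in encoder (assumption): tanh of the mean embedding in a small window."""
    out = []
    for i in range(len(tokens)):
        lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
        neigh = [E[tokens[j]] for j in range(lo, hi)]
        dim = len(neigh[0])
        out.append([math.tanh(sum(v[d] for v in neigh) / len(neigh))
                    for d in range(dim)])
    return out

def reading_step(E, tokens, gate_bias=0.0):
    """One hypothetical reading step: returns refined embeddings E^l from E^(l-1)."""
    ctx = encode_context(E, tokens)
    refined = dict(E)  # words absent from `tokens` keep their previous vectors
    gate = 1.0 / (1.0 + math.exp(-gate_bias))  # fixed scalar gate (assumption)
    for word in set(tokens):
        occurrences = [ctx[i] for i, t in enumerate(tokens) if t == word]
        pooled = [max(col) for col in zip(*occurrences)]  # element-wise max
        refined[word] = [gate * old + (1.0 - gate) * new
                         for old, new in zip(E[word], pooled)]
    return refined

def read_incrementally(E0, texts):
    """Apply one reading step per text, threading the embeddings through."""
    E = E0
    for tokens in texts:
        E = reading_step(E, tokens)
    return E

# Toy usage: a task sentence followed by one free-text background statement.
random.seed(0)
vocab = ["bird", "penguin", "fly", "can", "not"]
E0 = {w: [random.uniform(-1, 1) for _ in range(4)] for w in vocab}
task = ["penguin", "can", "fly"]
knowledge = ["penguin", "can", "not", "fly"]
E2 = read_incrementally(E0, [task, knowledge])
```

In this toy setup, words that never appear in any text read (here `bird`) keep their initial vectors, while words that do appear are moved toward their pooled contextual representation at each step. A downstream task model would then consume `E2` in place of the static `E0`.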
