Natural Language Inference over Interaction Space

Yichen Gong, Heng Luo, Jian Zhang

TL;DR

The paper introduces the Interactive Inference Network (IIN) and its densely interactive instantiation (DIIN) for Natural Language Inference, arguing that semantic information is embedded in the cross-sentence interaction space and can be captured by an interaction tensor. DIIN combines rich token embeddings (word, character, POS, and exact-match (EM) features), separate sentence encodings, a word-by-word interaction tensor, and DenseNet-based feature extraction to produce an NLI classifier. On the SNLI, MultiNLI, and Quora datasets, DIIN reports state-of-the-art results, and ablations show that the EM features, the interaction tensor, and the dense inter-layer connections each contribute materially. The authors suggest integrating external commonsense knowledge as a direction for further improving cross-sentence understanding.

Abstract

The Natural Language Inference (NLI) task requires an agent to determine the logical relationship between a natural language premise and a natural language hypothesis. We introduce the Interactive Inference Network (IIN), a novel class of neural network architectures that achieves high-level understanding of a sentence pair by hierarchically extracting semantic features from the interaction space. We show that an interaction tensor (attention weight) contains semantic information to solve natural language inference, and that a denser interaction tensor contains richer semantic information. One instance of this architecture, the Densely Interactive Inference Network (DIIN), demonstrates state-of-the-art performance on large-scale NLI corpora and a large-scale NLI-like corpus. Notably, DIIN achieves a greater than 20% error reduction on the challenging Multi-Genre NLI (MultiNLI) dataset with respect to the strongest published system.
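
To make the idea of a word-by-word interaction tensor concrete, here is a minimal sketch in plain Python. It assumes the interaction between a premise token and a hypothesis token is the element-wise product of their encoded vectors; the text above does not spell out the interaction function, so treat that choice, the function name `interaction_tensor`, and the toy vectors as illustrative assumptions rather than the authors' implementation.

```python
from typing import List

Vector = List[float]


def interaction_tensor(premise: List[Vector],
                       hypothesis: List[Vector]) -> List[List[Vector]]:
    """Build a (p x h x d) tensor where entry [i][j] is the element-wise
    product of premise token i and hypothesis token j.

    Assumption: element-wise product is used as the interaction function.
    """
    if not premise or not hypothesis:
        return []
    dim = len(premise[0])
    for vec in premise + hypothesis:
        if len(vec) != dim:
            raise ValueError("all token vectors must share one dimension")
    return [
        [[p_k * h_k for p_k, h_k in zip(p_vec, h_vec)] for h_vec in hypothesis]
        for p_vec in premise
    ]


# Toy encoded sentences (hypothetical values): 3 premise tokens,
# 2 hypothesis tokens, vector dimension 2.
premise = [[1.0, 2.0], [0.5, -1.0], [3.0, 0.0]]
hypothesis = [[2.0, 1.0], [-1.0, 4.0]]

I = interaction_tensor(premise, hypothesis)
shape = (len(I), len(I[0]), len(I[0][0]))
```

The result keeps one channel per embedding dimension instead of collapsing each token pair to a single scalar weight, which is one way to read the claim that a denser interaction tensor carries richer semantic information. A convolutional feature extractor can then treat the tensor as a multi-channel image over (premise position, hypothesis position).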

Paper Structure

This paper contains 26 sections, 6 equations, 2 figures, 7 tables.

Figures (2)

  • Figure 1: A visual illustration of Interactive Inference Network (IIN).
  • Figure 2: A visualization of hidden representations. The premise is "South Carolina has no referendum right, so the Supreme Court canceled the vote and upheld the ban." and the hypothesis is "South Carolina has a referendum right, so the Supreme Court was powerless over the state." The upper row is sampled from the interaction tensor $I$, and the lower row is sampled from the feature map of the first dense block. We use the viridis colormap, where yellow indicates an active neuron and purple indicates an inactive one.