Modelling Interaction of Sentence Pair with coupled-LSTMs

Pengfei Liu, Xipeng Qiu, Xuanjing Huang

TL;DR

The paper tackles sentence-pair semantic matching by replacing independent sentence encoders with coupled-LSTMs that model strong, bidirectional interactions between the two sentences. It introduces two variants, LC-LSTMs and TC-LSTMs, which process the pair in four directions and use an aggregation layer with dynamic pooling to capture interactions at multiple granularities, trained end-to-end. On the large-scale SNLI entailment dataset and a Yahoo! Answers question-answer matching task, the proposed models achieve state-of-the-art or competitive performance with fewer parameters, and neuron-level analyses reveal interpretable interaction patterns. The authors suggest adding gating mechanisms along the depth dimension as future work to strengthen depth-wise interaction and improve performance.

Abstract

Recently, there has been rising interest in modelling the interactions of two sentences with deep neural networks. However, most existing methods encode the two sequences with separate encoders, so that each sentence is encoded with little or no information from the other. In this paper, we propose a deep architecture that models the strong interaction of a sentence pair with two coupled-LSTMs. Specifically, we introduce two ways of coupling the interdependences of two LSTMs, which capture the local contextualized interactions of the two sentences. We then aggregate these interactions and use dynamic pooling to select the most informative features. Experiments on two very large datasets demonstrate the efficacy of our proposed architecture and its superiority to state-of-the-art methods.
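
The abstract does not spell out how dynamic pooling works, so the following is only a minimal sketch of one common form of it, not the authors' implementation. It assumes the aggregated interactions form an (m, n, d) grid, one d-dimensional vector per word pair, and that pooling splits this grid into a fixed number of roughly equal cells and takes the maximum in each. The function name dynamic_max_pool, the grid layout, and the choice of max as the pooling operator are all assumptions made for illustration.

```python
import numpy as np


def dynamic_max_pool(grid: np.ndarray, out_rows: int, out_cols: int) -> np.ndarray:
    """Max-pool an (m, n, d) interaction grid to a fixed (out_rows, out_cols, d) shape.

    Hypothetical helper: the cell boundaries adapt to the sentence lengths m and n,
    so sentence pairs of different lengths yield outputs of the same size.
    """
    m, n, d = grid.shape
    if m < out_rows or n < out_cols:
        raise ValueError("grid is smaller than the requested output size")

    # Integer bin edges; consecutive edges differ by at least 1 because m >= out_rows
    # and n >= out_cols, so no cell is empty.
    row_edges = [(k * m) // out_rows for k in range(out_rows + 1)]
    col_edges = [(k * n) // out_cols for k in range(out_cols + 1)]

    pooled = np.empty((out_rows, out_cols, d), dtype=grid.dtype)
    for i in range(out_rows):
        for j in range(out_cols):
            cell = grid[row_edges[i]:row_edges[i + 1], col_edges[j]:col_edges[j + 1]]
            # Keep the strongest activation of each feature within the cell.
            pooled[i, j] = cell.max(axis=(0, 1))
    return pooled


# Two sentence pairs with different lengths map to the same output shape.
short_pair = np.random.rand(4, 6, 8)
long_pair = np.random.rand(11, 9, 8)
print(dynamic_max_pool(short_pair, 2, 3).shape)
print(dynamic_max_pool(long_pair, 2, 3).shape)
```

Because the cell sizes scale with the input, a fixed-size classifier can sit on top of the pooled features regardless of sentence length.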

Paper Structure

This paper contains 35 sections, 10 equations, 3 figures, 4 tables.

Figures (3)

  • Figure 1: Four different coupled-LSTMs.
  • Figure 2: Architecture of coupled-LSTMs for sentence-pair encoding. Inputs are fed to four C-LSTMs followed by an aggregation layer. Blue cuboids represent different contextual information from four directions.
  • Figure 3: Illustration of two interpretable neurons and some word pairs captured by these neurons. Darker patches denote higher activations.