Stochastic Answer Networks for Natural Language Inference
Xiaodong Liu, Kevin Duh, Jianfeng Gao
TL;DR
Natural language inference demands multi-step reasoning. This work introduces a stochastic answer network (SAN) that maintains a state and iteratively refines its predictions over multiple passes through the premise and hypothesis. SAN comprises four layers (lexicon encoding, contextual encoding, memory generation, and an answer module) and uses a memory-augmented, attention-driven mechanism with stochastic prediction dropout. Empirical results show that SAN outperforms single-step baselines and achieves state-of-the-art results on SciTail and Quora while remaining competitive on SNLI and MultiNLI, with further gains when paired with BERT. The findings highlight the effectiveness of multi-step inference for NLI and suggest avenues for leveraging pretrained contextual embeddings and multi-task learning.
Abstract
We propose a stochastic answer network (SAN) to explore multi-step inference strategies in Natural Language Inference. Rather than directly predicting the results given the inputs, the model maintains a state and iteratively refines its predictions. Our experiments show that SAN achieves state-of-the-art results on three benchmarks: the Stanford Natural Language Inference (SNLI) dataset, the Multi-Genre Natural Language Inference (MultiNLI) dataset, and the Quora Question Pairs dataset.
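To make the idea of iterative refinement with stochastic prediction dropout concrete, the following is a minimal sketch rather than the authors' implementation. It assumes that each reasoning step emits a probability distribution over the classes, that the final prediction is the average of those distributions, and that during training whole steps are randomly discarded before averaging. The function name, the parameter names, and the example numbers are hypothetical.

```python
import random


def average_step_predictions(step_probs, drop_rate=0.0, rng=None):
    """Average per-step class distributions, optionally dropping whole steps.

    step_probs: list of T lists, each a probability distribution over classes.
    drop_rate:  chance of discarding each step's prediction (assumed to be
                nonzero only during training; 0.0 keeps every step).
    rng:        optional random.Random instance for reproducibility.
    """
    rng = rng or random.Random()
    # Keep each step independently with probability (1 - drop_rate).
    kept = [p for p in step_probs if rng.random() >= drop_rate]
    if not kept:
        # Assumption: always keep at least one step so the average is defined.
        kept = [rng.choice(step_probs)]
    num_classes = len(step_probs[0])
    return [sum(p[c] for p in kept) / len(kept) for c in range(num_classes)]


# Hypothetical predictions from three reasoning steps over
# (entailment, neutral, contradiction).
steps = [[0.5, 0.3, 0.2], [0.6, 0.3, 0.1], [0.7, 0.2, 0.1]]

inference_probs = average_step_predictions(steps)  # every step contributes
training_probs = average_step_predictions(steps, 0.2, random.Random(0))
```

Under these assumptions, dropping steps at training time discourages the model from relying on any single step's output, while averaging all steps at inference time combines their predictions.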
