A Hybrid Neural Network Model for Commonsense Reasoning
Pengcheng He, Xiaodong Liu, Weizhu Chen, Jianfeng Gao
TL;DR
The paper tackles pronoun resolution and broader commonsense reasoning with a Hybrid Neural Network (HNN) that combines a masked language model (MLM) and a semantic similarity model (SSM) on top of a shared BERT encoder. Trained with a multi-task objective on the WSCR dataset and a ranking loss, HNN achieves state-of-the-art results on WNLI, WSC, and PDP60. Ablation studies show that the language-model-based and similarity-based components are complementary and that the ranking objective improves results. The work highlights the value of hybrid, multi-task strategies for challenging commonsense tasks and suggests avenues for extending them to more complex reasoning problems.
Abstract
This paper proposes a hybrid neural network (HNN) model for commonsense reasoning. An HNN consists of two component models, a masked language model and a semantic similarity model, which share a BERT-based contextual encoder but use different model-specific input and output layers. HNN obtains new state-of-the-art results on three classic commonsense reasoning tasks, pushing the WNLI benchmark to 89%, the Winograd Schema Challenge (WSC) benchmark to 75.1%, and the PDP60 benchmark to 90.0%. An ablation study shows that language models and semantic similarity models are complementary approaches to commonsense reasoning, and HNN effectively combines the strengths of both. The code and pre-trained models will be publicly available at https://github.com/namisan/mt-dnn.
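The hybrid structure described above, two component models sharing one encoder with separate output layers, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration rather than the authors' implementation: `toy_shared_encoder` is a hashed bag-of-words stand-in for the BERT encoder, `dot_score` and `neg_distance_score` are placeholder heads for the MLM and SSM components, and averaging the two per-candidate distributions is one simple way to combine them, not a rule stated in the abstract.

```python
import math
from dataclasses import dataclass
from typing import Callable, List, Sequence

Vector = List[float]


def toy_shared_encoder(text: str, dim: int = 8) -> Vector:
    """Hypothetical stand-in for the shared BERT encoder (hashed bag of words)."""
    vec = [0.0] * dim
    for token in text.lower().split():
        # Deterministic bucket index; Python's built-in hash() is salted per run.
        vec[sum(ord(ch) for ch in token) % dim] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def dot_score(context: Vector, candidate: Vector) -> float:
    """Placeholder for the MLM head: dot product of the two encodings."""
    return sum(a * b for a, b in zip(context, candidate))


def neg_distance_score(context: Vector, candidate: Vector) -> float:
    """Placeholder for the SSM head: negative Euclidean distance."""
    return -math.sqrt(sum((a - b) ** 2 for a, b in zip(context, candidate)))


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over candidate scores."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


@dataclass
class HybridScorer:
    """Two heads on one shared encoder; outputs are averaged (an assumption)."""
    encoder: Callable[[str], Vector]
    mlm_head: Callable[[Vector, Vector], float]
    ssm_head: Callable[[Vector, Vector], float]

    def candidate_probs(self, sentence: str, candidates: Sequence[str]) -> List[float]:
        context = self.encoder(sentence)                   # encoded once, used by both heads
        cand_vecs = [self.encoder(c) for c in candidates]  # same encoder for candidates
        p_mlm = softmax([self.mlm_head(context, v) for v in cand_vecs])
        p_ssm = softmax([self.ssm_head(context, v) for v in cand_vecs])
        return [(a + b) / 2 for a, b in zip(p_mlm, p_ssm)]

    def predict(self, sentence: str, candidates: Sequence[str]) -> str:
        probs = self.candidate_probs(sentence, candidates)
        best = max(range(len(candidates)), key=probs.__getitem__)
        return candidates[best]


if __name__ == "__main__":
    scorer = HybridScorer(toy_shared_encoder, dot_score, neg_distance_score)
    sentence = "The trophy does not fit in the suitcase because it is too big."
    print(scorer.candidate_probs(sentence, ["the trophy", "the suitcase"]))
```

The toy encoder carries no linguistic knowledge, so the printed probabilities are not meaningful. The sketch only shows the control flow: one encoder, two heads scoring the same candidates, and a combined distribution from which the antecedent is picked.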
