Table of Contents
Fetching ...

A Hybrid Neural Network Model for Commonsense Reasoning

Pengcheng He, Xiaodong Liu, Weizhu Chen, Jianfeng Gao

TL;DR

The paper tackles pronoun resolution and broader commonsense reasoning by proposing a Hybrid Neural Network (HNN) that unites a masked language model (MLM) and a semantic similarity model (SSM) under a shared BERT encoder. By training with a multi-task objective on the WSCR dataset and incorporating a ranking loss, HNN achieves state-of-the-art results on WNLI, WSC, and PDP60, demonstrating that language-model based and similarity-based cues are complementary. Ablation studies confirm the two components’ complementary roles and the benefit of the ranking objective. The approach highlights the value of hybrid, multi-task strategies for tackling challenging commonsense tasks and suggests avenues for extending to more complex reasoning problems.

Abstract

This paper proposes a hybrid neural network (HNN) model for commonsense reasoning. An HNN consists of two component models, a masked language model and a semantic similarity model, which share a BERT-based contextual encoder but use different model-specific input and output layers. HNN obtains new state-of-the-art results on three classic commonsense reasoning tasks, pushing the WNLI benchmark to 89%, the Winograd Schema Challenge (WSC) benchmark to 75.1%, and the PDP60 benchmark to 90.0%. An ablation study shows that language models and semantic similarity models are complementary approaches to commonsense reasoning, and HNN effectively combines the strengths of both. The code and pre-trained models will be publicly available at https://github.com/namisan/mt-dnn.

A Hybrid Neural Network Model for Commonsense Reasoning

TL;DR

The paper tackles pronoun resolution and broader commonsense reasoning by proposing a Hybrid Neural Network (HNN) that unites a masked language model (MLM) and a semantic similarity model (SSM) under a shared BERT encoder. By training with a multi-task objective on the WSCR dataset and incorporating a ranking loss, HNN achieves state-of-the-art results on WNLI, WSC, and PDP60, demonstrating that language-model based and similarity-based cues are complementary. Ablation studies confirm the two components’ complementary roles and the benefit of the ranking objective. The approach highlights the value of hybrid, multi-task strategies for tackling challenging commonsense tasks and suggests avenues for extending to more complex reasoning problems.

Abstract

This paper proposes a hybrid neural network (HNN) model for commonsense reasoning. An HNN consists of two component models, a masked language model and a semantic similarity model, which share a BERT-based contextual encoder but use different model-specific input and output layers. HNN obtains new state-of-the-art results on three classic commonsense reasoning tasks, pushing the WNLI benchmark to 89%, the Winograd Schema Challenge (WSC) benchmark to 75.1%, and the PDP60 benchmark to 90.0%. An ablation study shows that language models and semantic similarity models are complementary approaches to commonsense reasoning, and HNN effectively combines the strengths of both. The code and pre-trained models will be publicly available at https://github.com/namisan/mt-dnn.

Paper Structure

This paper contains 11 sections, 8 equations, 3 figures, 6 tables.

Figures (3)

  • Figure 1: Architecture of the hybrid model for commonsense reasoning. The model consists of two component models, a masked language model (MLM) and a semantic similarity model (SSM). The input includes the sentence $S$, which contains a pronoun to be resolve, and a candidate antecedent $C$. The two component models share the BERT-based contextual encoder, but use different model-specific input and output layers. The final output score is the combination of the two component model scores.
  • Figure 2: Comparison with SSM and MLM on WNLI examples.
  • Figure 3: Comparison of different task formulation on WNLI.