BUCA: A Binary Classification Approach to Unsupervised Commonsense Question Answering

Jie He, Simon Chi Lok U, Víctor Gutiérrez-Basulto, Jeff Z. Pan

TL;DR

This paper tackles unsupervised commonsense question answering by converting knowledge-graph triples into text-based QA pairs and training a binary classifier to judge answer reasonableness. BUCA leverages templates to produce positive and negative QA pairs from ConceptNet and ATOMIC, and it employs a combination of traditional binary loss, margin ranking loss, and supervised contrastive loss to optimize a pretrained LM. Inference scores each candidate answer for a question and selects the most reasonable option, enabling zero-shot-like performance with relatively little data. Experiments on five benchmarks show BUCA achieving strong results with substantially less training data than several KG-based baselines, highlighting the method’s data efficiency and effectiveness for unsupervised commonsense reasoning.
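The scoring-and-selection step described above can be illustrated with a minimal sketch. It assumes a generic `score_fn(question, answer)` that returns a reasonableness score; in BUCA this role is played by the fine-tuned language model, which is replaced here by a hypothetical lookup table. The hinge form of the margin ranking loss and the margin value of 1.0 are the standard textbook choices, not necessarily the exact formulation used in the paper.

```python
from typing import Callable, List, Tuple


def margin_ranking_loss(pos_score: float, neg_score: float, margin: float = 1.0) -> float:
    """Standard hinge-style ranking loss: zero once the reasonable text
    outscores the unreasonable one by at least `margin` (assumed value)."""
    return max(0.0, margin - (pos_score - neg_score))


def pick_answer(
    question: str,
    candidates: List[str],
    score_fn: Callable[[str, str], float],
) -> Tuple[int, List[float]]:
    """Score every candidate answer and return the index of the highest one."""
    scores = [score_fn(question, c) for c in candidates]
    best = max(range(len(candidates)), key=scores.__getitem__)
    return best, scores


# Hypothetical stand-in for the trained classifier: a fixed lookup table.
TOY_SCORES = {"to drive nails": 0.9, "to write letters": 0.2, "to swim": 0.1}


def toy_score_fn(question: str, answer: str) -> float:
    return TOY_SCORES.get(answer, 0.0)


best_index, all_scores = pick_answer(
    "What is a hammer used for?",
    ["to write letters", "to drive nails", "to swim"],
    toy_score_fn,
)
```

Because each candidate is scored independently, the same binary classifier can handle questions with any number of answer options.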

Abstract

Unsupervised commonsense reasoning (UCR) is becoming increasingly popular as the construction of commonsense reasoning datasets is expensive, and they are inevitably limited in their scope. A popular approach to UCR is to fine-tune language models with external knowledge (e.g., knowledge graphs), but this usually requires a large number of training examples. In this paper, we propose to transform the downstream multiple choice question answering task into a simpler binary classification task by ranking all candidate answers according to their reasonableness. To this end, for training the model, we convert the knowledge graph triples into reasonable and unreasonable texts. Extensive experimental results show the effectiveness of our approach on various multiple choice question answering benchmarks. Furthermore, compared with existing UCR approaches using KGs, ours is less data hungry. Our code is available at https://github.com/probe2/BUCA.
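As an illustration of turning knowledge graph triples into reasonable and unreasonable texts, the sketch below fills a per-relation template and creates a negative example by swapping in the tail of another triple with the same relation. The template wording, the relation names, and this negative-sampling strategy are assumptions made for the example; the paper's own templates and sampling procedure may differ.

```python
import random
from typing import List, Tuple

Triple = Tuple[str, str, str]  # (head, relation, tail)

# Hypothetical templates, one per relation; not taken from the paper.
TEMPLATES = {
    "UsedFor": "What is {head} used for? {tail}",
    "AtLocation": "Where can you find {head}? {tail}",
}


def triple_to_text(head: str, relation: str, tail: str) -> str:
    """Render a triple as a question-answer text using its relation template."""
    return TEMPLATES[relation].format(head=head, tail=tail)


def build_examples(triples: List[Triple], seed: int = 0) -> List[Tuple[str, int]]:
    """Return (text, label) pairs: label 1 for the original triple,
    label 0 for a copy whose tail is replaced (assumed sampling strategy)."""
    rng = random.Random(seed)
    examples = []
    for head, relation, tail in triples:
        # Candidate wrong tails: tails of other triples sharing the relation.
        wrong_tails = [t for _, r, t in triples if r == relation and t != tail]
        if not wrong_tails:
            continue  # no negative available, so skip this triple
        examples.append((triple_to_text(head, relation, tail), 1))
        examples.append((triple_to_text(head, relation, rng.choice(wrong_tails)), 0))
    return examples


triples = [
    ("a hammer", "UsedFor", "driving nails"),
    ("a pen", "UsedFor", "writing"),
    ("a fish", "AtLocation", "water"),
]
examples = build_examples(triples)
```

Random tail replacement can occasionally produce a text that is in fact reasonable, so a real pipeline would likely need some filtering of such false negatives.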

Paper Structure

This paper contains 26 sections, 3 equations, 1 figure, and 11 tables.

Figures (1)

  • Figure 1: After BUCA is trained on the above question from the training set, it is then able to rate the reasonableness of each sentence of the downstream task.