COMET-ATOMIC 2020: On Symbolic and Neural Commonsense Knowledge Graphs

Jena D. Hwang, Chandra Bhagavatula, Ronan Le Bras, Jeff Da, Keisuke Sakaguchi, Antoine Bosselut, Yejin Choi

TL;DR

This work introduces Atomic-20-20, a large commonsense knowledge graph with 1.33 million tuples across 23 relations, designed to capture knowledge that is difficult for language models to infer. It formalizes a transfer-learning framework (COMET) that adapts pretrained LMs to generate high-quality commonsense tuples on demand, and provides a head-to-head comparison against ConceptNet, Atomic, and TransOMCS. Empirical results show that Atomic-20-20 offers superior accuracy and coverage among CSKGs, and that COMET models trained on Atomic-20-20 outperform GPT-3 in few-shot settings while using far fewer parameters. The findings advocate designing CSKGs around non-obvious, defeasible knowledge and using them both as static resources and as a means of adapting LMs, so that models generalize better to unseen entities and events.

Abstract

Recent years have brought about a renewed interest in commonsense representation and reasoning in the field of natural language understanding. The development of new commonsense knowledge graphs (CSKG) has been central to these advances as their diverse facts can be used and referenced by machine learning models for tackling new and challenging tasks. At the same time, there remain questions about the quality and coverage of these resources due to the massive scale required to comprehensively encompass general commonsense knowledge. In this work, we posit that manually constructed CSKGs will never achieve the coverage necessary to be applicable in all situations encountered by NLP agents. Therefore, we propose a new evaluation framework for testing the utility of KGs based on how effectively implicit knowledge representations can be learned from them. With this new goal, we propose ATOMIC 2020, a new CSKG of general-purpose commonsense knowledge containing knowledge that is not readily available in pretrained language models. We evaluate its properties in comparison with other leading CSKGs, performing the first large-scale pairwise study of commonsense knowledge resources. Next, we show that ATOMIC 2020 is better suited for training knowledge models that can generate accurate, representative knowledge for new, unseen entities and events. Finally, through human evaluation, we show that the few-shot performance of GPT-3 (175B parameters), while impressive, remains ~12 absolute points lower than a BART-based knowledge model trained on ATOMIC 2020 despite using over 430x fewer parameters.
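To make the idea of training a knowledge model on CSKG tuples more concrete, the following is a minimal sketch of how a (head, relation, tail) tuple might be turned into a source/target pair for a sequence-to-sequence LM. The `KGTuple` class, the `[GEN]` marker, the helper names, and the example tuple are illustrative assumptions, not taken from the paper or from the released dataset; only the relation name `ObjectUse` appears in the source.

```python
from dataclasses import dataclass
from typing import Tuple

# Hypothetical marker telling the model where generation starts (an assumption).
GEN_TOKEN = "[GEN]"


@dataclass(frozen=True)
class KGTuple:
    """One commonsense tuple: a head phrase, a relation label, a tail phrase."""
    head: str
    relation: str
    tail: str


def make_query(head: str, relation: str) -> str:
    """Build the model input for a head/relation pair, seen or unseen."""
    return f"{head} {relation} {GEN_TOKEN}"


def to_seq2seq_pair(t: KGTuple) -> Tuple[str, str]:
    """Convert a tuple into (source, target) text for fine-tuning."""
    return make_query(t.head, t.relation), t.tail


# Illustrative tuple only; not claimed to be an entry in the actual graph.
example = KGTuple("paper clip", "ObjectUse", "hold papers together")
print(to_seq2seq_pair(example))
```

At inference time, the same `make_query` helper could be called with a head that never appeared in training, which is the generalization setting the abstract describes.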

Paper Structure

This paper contains 19 sections, 8 figures, 12 tables.

Figures (8)

  • Figure 1: A tiny subset of Atomic$^{20}_{20}$, a large atlas of social and physical commonsense relations. Relations in the top-left quadrant reflect relations from Atomic.
  • Figure 2: Atomic$^{20}_{20}$ tuple count distribution compared to Atomic (sap2018atomic) and ConceptNet, either its commonsense subset (li-16) or the full set (speer2017conceptnet).
  • Figure 3: Atomic$^{20}_{20}$ relations organized into a hierarchical structure.
  • Figure 4: Percentage distribution of raw accuracy ratings broken down by KB (i.e., breakdown of Table \ref{tab:precision-results}). From left to right are the ratings for social-interaction tuples, physical-entity tuples, and event-centered tuples. We use the ConceptNet-to-Atomic$^{20}_{20}$ relation mappings (shown in Table \ref{tab:conceptnet-atomic-relation-mapping}) to categorize ConceptNet and TransOMCS relations into the three categories. For multiple mappings, we map the ConceptNet/TransOMCS labels to the majority mapped label (in bold in Table \ref{tab:conceptnet-atomic-relation-mapping}). Note that the latter two figures do not include Atomic as the KB only includes social-interaction relations.
  • Figure 5: Mechanical Turk template used to collect ObjectUse tuples. We collected three sets of object affordances per HIT, but two have been truncated to fit this page.
  • ...and 3 more figures