Penguins Don't Fly: Reasoning about Generics through Instantiations and Exceptions

Emily Allaway, Jena D. Hwang, Chandra Bhagavatula, Kathleen McKeown, Doug Downey, Yejin Choi

TL;DR

This work introduces a linguistically grounded framework for automatically generating exemplars for generic statements: instantiations (cases where the generic holds) and exceptions (cases where it does not). By formalizing three categories of generics with corresponding logical forms, the authors derive templates and use NeuroLogic$^{\star}$ constrained decoding to control generation, then filter the outputs for viability and validity with RoBERTa-based discriminators. For roughly 650 generics, the method yields ~19k exemplars and outperforms a GPT-3 baseline by 12.8 precision points in human evaluations, with improved controllability and quality, especially for exceptions. The study highlights the insufficiency of commonsense knowledge bases as a source of exemplars, the importance of controllability grounded in linguistic theory, and the challenges exemplars pose for natural language inference, pointing to future work on reasoning with defaults and counterexamples.

Abstract

Generics express generalizations about the world (e.g., birds can fly) that are not universally true (e.g., newborn birds and penguins cannot fly). Commonsense knowledge bases, used extensively in NLP, encode some generic knowledge but rarely enumerate such exceptions, and knowing when a generic statement holds or does not hold true is crucial for developing a comprehensive understanding of generics. We present a novel framework informed by linguistic theory to generate exemplars -- specific cases in which a generic holds true or false. We generate ~19k exemplars for ~650 generics and show that our framework outperforms a strong GPT-3 baseline by 12.8 precision points. Our analysis highlights the importance of linguistic theory-based controllability for generating exemplars, the insufficiency of knowledge bases as a source of exemplars, and the challenges exemplars pose for the task of natural language inference.
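
The generate-then-filter idea summarized above can be illustrated with a toy sketch. This is not the authors' implementation: the template filling below merely stands in for constrained decoding, and the lookup-table scorer stands in for trained discriminators. All names (`Exemplar`, `candidate_exemplars`, `filter_exemplars`, `toy_score`, `FLIGHTLESS`) are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Exemplar:
    generic: str  # the generic being exemplified, e.g. "Birds can fly"
    text: str     # the candidate statement
    kind: str     # "instantiation" (generic holds) or "exception" (it does not)


def candidate_exemplars(generic: str, subtypes: List[str], prop: str) -> List[Exemplar]:
    """Fill two toy templates per subtype (assumption: stands in for a real generator)."""
    candidates = []
    for subtype in subtypes:
        candidates.append(Exemplar(generic, f"{subtype} can {prop}", "instantiation"))
        candidates.append(Exemplar(generic, f"{subtype} cannot {prop}", "exception"))
    return candidates


def filter_exemplars(
    candidates: List[Exemplar],
    score: Callable[[Exemplar], float],
    threshold: float = 0.5,
) -> List[Exemplar]:
    """Keep only candidates whose score reaches the threshold."""
    return [c for c in candidates if score(c) >= threshold]


# Toy world knowledge (assumption): subjects for which "can fly" is false.
FLIGHTLESS = {"Penguins", "Newborn birds"}


def toy_score(exemplar: Exemplar) -> float:
    """Return 1.0 if the candidate agrees with the toy lookup table, else 0.0."""
    subject = exemplar.text.split(" can")[0]
    is_flightless = subject in FLIGHTLESS
    truthful = is_flightless == (exemplar.kind == "exception")
    return 1.0 if truthful else 0.0


candidates = candidate_exemplars("Birds can fly", ["Sparrows", "Penguins"], "fly")
kept = filter_exemplars(candidates, toy_score)
for exemplar in kept:
    print(f"{exemplar.kind}: {exemplar.text}")
```

Separating candidate generation from scoring mirrors the two-stage structure described in the summary: the generator may overproduce, and the filter decides which candidates are truthful instantiations or exceptions.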

Paper Structure (56 sections, 3 equations, 8 figures, 15 tables)

Figures (8)

  • Figure 1: We present an exemplar generator: given a generic like "Birds can fly", it generates truthful statements where the generic does (instantiations) and does not (exceptions) hold. We extract commonsense knowledge (e.g., from ConceptNet; Speer et al., 2017) into linguistically informed prompts and constraints for constrained generation (Lu et al., 2021). We use trained discriminators to filter for quality.
  • Figure 2: Overview of our method for an input generic.
  • Figure 3: Exemplars and their correct NLI labels.
  • Figure 4: Task instructions for the first part of the generic type categorization annotation (§\ref{sec:anndetails}).
  • Figure 5: Task instructions for the second part of the generic type categorization annotation (§\ref{sec:anndetails}).
  • ...and 3 more figures

Theorems & Definitions (2)

  • Definition: Instantiations
  • Definition: Exceptions