Generalizations across filler-gap dependencies in neural language models

Katherine Howitt, Sathvik Nair, Allison Dods, Robert Melvin Hopkins

TL;DR

This work explicitly controls the input to a neural language model (NLM) to test whether the model posits a shared representation for filler-gap dependencies (FGDs). It shows that although NLMs succeed at differentiating grammatical from ungrammatical FGDs, they rely on superficial properties of the input rather than on a shared generalization.

Abstract

Humans develop their grammars by making structural generalizations from finite input. We ask how filler-gap dependencies, which share a structural generalization despite diverse surface forms, might arise from the input. We explicitly control the input to a neural language model (NLM) to uncover whether the model posits a shared representation for filler-gap dependencies. We show that while NLMs do have success differentiating grammatical from ungrammatical filler-gap dependencies, they rely on superficial properties of the input, rather than on a shared generalization. Our work highlights the need for specific linguistic inductive biases to model language acquisition.

Paper Structure

This paper contains 22 sections, 4 figures, 7 tables.

Figures (4)

  • Figure 1: Filler effects for simple constructions for the pretrained model and Cleft-RNN.
  • Figure 2: Filler effects for simple and island constructions for the pretrained model and Cleft-RNN. Since this dependency was not learned for topicalization, we do not display these results.
  • Figure 3: Filler effects for simple and island constructions for the pretrained model and Topic-RNN.
  • Figure 4: Filler effects for a pretrained GPT-2 model, evaluated on the same set of sentences as the RNNs.