What Artificial Neural Networks Can Tell Us About Human Language Acquisition

Alex Warstadt; Samuel R. Bowman

TL;DR

The chapter investigates how artificial neural networks can illuminate human language acquisition by testing learnability under impoverished conditions through ablations that remove assumed innate advantages. It argues that model learners, when aligned with human-like data constraints and supplemented by multimodal or interactive inputs, can provide proofs of concept for what is learnable and under which environmental and architectural conditions. The authors review evaluation methods (unsupervised and supervised tests, including BLiMP and MSGS) and underscore the importance of proximal, testable links between model behavior and human language knowledge while acknowledging limitations in generalizing from models to humans. They advocate for developing more ecologically valid model learners and learning environments, emphasizing ethical, scalable experimentation and the potential for new cognitive paradigms to advance our understanding of language acquisition and its underlying biases. Overall, the work delineates a pragmatic path toward using neural models as complementary tools for probing learnability, bias, and data efficiency in human language development.

Abstract

Rapid progress in machine learning for natural language processing has the potential to transform debates about how humans learn language. However, the learning environments and biases of current artificial learners and humans diverge in ways that weaken the impact of the evidence obtained from learning simulations. For example, today's most effective neural language models are trained on roughly one thousand times the amount of linguistic data available to a typical child. To increase the relevance of learnability results from computational models, we need to train model learners without significant advantages over humans. If an appropriate model successfully acquires some target linguistic knowledge, it can provide a proof of concept that the target is learnable in a hypothesized human learning scenario. Plausible model learners will enable us to carry out experimental manipulations to make causal inferences about variables in the learning environment, and to rigorously test poverty-of-the-stimulus-style claims arguing for innate linguistic knowledge in humans on the basis of speculations about learnability. Comparable experiments will never be possible with human subjects due to practical and ethical considerations, making model learners an indispensable resource. So far, attempts to deprive current models of unfair advantages obtain sub-human results for key grammatical behaviors such as acceptability judgments. But before we can justifiably conclude that language learning requires more prior domain-specific knowledge than current models possess, we must first explore non-linguistic inputs in the form of multimodal stimuli and multi-agent interaction as ways to make our learners more efficient at learning from limited linguistic input.

Paper Structure (31 sections, 6 equations, 4 figures, 3 tables)

Introduction
Evidence from Model Learners
Generalizing Learnability Results from Models to Humans
Why Positive Results Are More Relevant
Applying Ablations to Debates in Language Acquisition
Tests of Human-Like Linguistic Knowledge
Testing for Competence vs. Performance
Unsupervised Tests
Acceptability Judgments, Minimal Pairs, BLiMP
Other Behavioral Predictions: Reading Time, Age-of-Acquisition
Supervised Tests
What Do Out-of-Domain Tests Tell Us About Learnability?
The Learning Environment
Data Quantity
Data Source
...and 16 more sections

Figures (4)

Figure 1: Comparison of human and model linguistic input (# of word tokens).
Figure 2: Example of an experiment following the Poverty of the Stimulus design from the MSGS dataset (Warstadt et al., 2020). A model is trained on ambiguous data whose labels are consistent with either a linguistic or a surface generalization, and tested on disambiguating data whose labels support only the linguistic generalization. Light green and darker red shading represents data or features associated with the positive and negative labels/predictions, respectively.
Figure 3: Learning curves adapted from Zhang et al. (2021), showing LM improvement in BLiMP performance as a function of the number of words of training data available to the model.
Figure 4: Comparison of BLiMP performance between adult humans and human-scale LMs. Model results are averages over three 100M word miniBERTas reported in Zhang et al. (2021).
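
The BLiMP scores in Figures 3 and 4 come from minimal-pair evaluation: a language model is counted as correct on a pair when it assigns a higher probability to the acceptable sentence than to its unacceptable counterpart. The following is a minimal sketch of that scoring scheme, not the chapter's setup. The add-one-smoothed bigram model stands in for a neural LM, and the corpus, the sentence pairs, and all function names are hypothetical illustrations rather than BLiMP items.

```python
import math
from collections import Counter


def train_bigram(sentences):
    """Count unigrams and bigrams over whitespace-tokenized sentences."""
    unigrams, bigrams = Counter(), Counter()
    for s in sentences:
        tokens = ["<s>"] + s.lower().split() + ["</s>"]
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    return unigrams, bigrams


def log_prob(sentence, unigrams, bigrams):
    """Add-one-smoothed bigram log-probability of a sentence."""
    tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
    vocab = len(unigrams) + 1  # +1 reserves probability mass for unseen words
    return sum(
        math.log((bigrams[(a, b)] + 1) / (unigrams[a] + vocab))
        for a, b in zip(tokens, tokens[1:])
    )


def minimal_pair_accuracy(pairs, unigrams, bigrams):
    """Fraction of (acceptable, unacceptable) pairs where the acceptable sentence scores higher."""
    correct = sum(
        log_prob(good, unigrams, bigrams) > log_prob(bad, unigrams, bigrams)
        for good, bad in pairs
    )
    return correct / len(pairs)


# Toy training corpus and subject-verb agreement pairs (illustrative only).
corpus = ["the cats sleep", "the cat sleeps", "the dogs bark", "the dog barks"]
pairs = [
    ("the cats sleep", "the cats sleeps"),
    ("the dog barks", "the dog bark"),
]

unigrams, bigrams = train_bigram(corpus)
accuracy = minimal_pair_accuracy(pairs, unigrams, bigrams)
print(f"Minimal-pair accuracy: {accuracy:.2f}")
```

Because the two sentences in a pair differ minimally, comparing their probabilities isolates one grammatical contrast without requiring any labeled training data, which is why this style of test suits learners trained only on raw text.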
