A Unified Assessment of the Poverty of the Stimulus Argument for Neural Language Models
Xiulin Yang, Arianna Bisazza, Nathan Schneider, Ethan Gotlieb Wilcox
TL;DR
This paper tests the Poverty of the Stimulus (PoS) claim for neural language learners using PoSH-Bench, a developmentally plausible training/evaluation suite targeting key PoS phenomena. Using GPT-2 variants trained on 10–50M words with manipulated direct evidence, the authors show that transformers can generalize above chance even without direct positive evidence, but remain less data-efficient than children. They further test three cognitively motivated inductive biases and find that while these biases improve general syntactic competence, they do not close the PoS-specific efficiency gap. The findings challenge the view that innate, language-specific constraints are the sole route to robust generalization and suggest that human-like data efficiency may require additional mechanisms, possibly multimodal or caregiver-mediated signals.
Abstract
How can children acquire native-level syntax from limited input? According to the Poverty of the Stimulus Hypothesis (PoSH), the linguistic input children receive is insufficient to explain certain generalizations that are robustly learned; innate linguistic constraints, many have argued, are thus necessary to explain language learning. Neural language models, which lack such language-specific constraints in their design, offer a computational test of this longstanding (but controversial) claim. We introduce PoSH-Bench, a training-and-evaluation suite targeting question formation, islands to movement, and other English phenomena at the center of the PoSH arguments. Training Transformer models on 10–50M words of developmentally plausible text, we find indications of generalization on all phenomena even without direct positive evidence, yet neural models remain less data-efficient and their generalizations are weaker than those of children. We further enhance our models with three recently proposed cognitively motivated inductive biases. We find these biases improve general syntactic competence but not PoSH-Bench performance. Our findings challenge the claim that innate syntax is the only possible route to generalization, while suggesting that human-like data efficiency requires inductive biases beyond those tested here.
