What Artificial Neural Networks Can Tell Us About Human Language Acquisition
Alex Warstadt, Samuel R. Bowman
TL;DR
The chapter examines how artificial neural networks can inform debates about human language acquisition by testing what is learnable under impoverished conditions, using ablations that remove hypothesized advantages from the learner. It argues that model learners, when restricted to human-like data and supplemented by multimodal or interactive inputs, can provide proofs of concept for what is learnable and under which environmental and architectural conditions. The authors review evaluation methods (unsupervised and supervised tests, including BLiMP and MSGS) and stress the need for testable links between model behavior and human linguistic knowledge, while acknowledging the limits of generalizing from models to humans. They advocate developing more ecologically valid model learners and learning environments, noting that models permit experiments that would be impractical or unethical with human subjects, and suggest that this line of work may open new paradigms for studying language acquisition and its underlying biases. Overall, the work outlines a pragmatic path toward using neural models as complementary tools for probing learnability, bias, and data efficiency in human language development.
Abstract
Rapid progress in machine learning for natural language processing has the potential to transform debates about how humans learn language. However, the learning environments and biases of current artificial learners and humans diverge in ways that weaken the impact of the evidence obtained from learning simulations. For example, today's most effective neural language models are trained on roughly one thousand times the amount of linguistic data available to a typical child. To increase the relevance of learnability results from computational models, we need to train model learners without significant advantages over humans. If an appropriate model successfully acquires some target linguistic knowledge, it can provide a proof of concept that the target is learnable in a hypothesized human learning scenario. Plausible model learners will enable us to carry out experimental manipulations to make causal inferences about variables in the learning environment, and to rigorously test poverty-of-the-stimulus-style claims arguing for innate linguistic knowledge in humans on the basis of speculations about learnability. Comparable experiments will never be possible with human subjects due to practical and ethical considerations, making model learners an indispensable resource. So far, attempts to deprive current models of unfair advantages obtain sub-human results for key grammatical behaviors such as acceptability judgments. But before we can justifiably conclude that language learning requires more prior domain-specific knowledge than current models possess, we must first explore non-linguistic inputs in the form of multimodal stimuli and multi-agent interaction as ways to make our learners more efficient at learning from limited linguistic input.
