No Such Thing as a General Learner: Language models and their dual optimization
Emmanuel Chemla, Ryan M. Nefdt
TL;DR
The paper asks whether LLMs are general learners and what their behavior implies for human language acquisition. It argues that neither humans nor LLMs are general learners and introduces a dual-optimization view in which LLMs are shaped both by a training-time objective and by evolution-like selection. Drawing on benchmarks, learning trajectories, and impossible-language experiments, the authors argue that LLM performance cannot straightforwardly settle debates about human biases or innate language faculties, because these systems are heavily engineered and selected. The work cautions against overextending cognitive-science inferences from LLMs, while outlining how these models can still inform our understanding of language learning when their developmental history and selection pressures are properly accounted for.
Abstract
What role can the otherwise successful Large Language Models (LLMs) play in the understanding of human cognition, and in particular in informing debates about language acquisition? To contribute to this question, we first argue that neither humans nor LLMs are general learners, in a variety of senses. In particular, we make a novel case that LLMs follow a dual-optimization process: they are optimized during their training (which is typically compared to language acquisition), and modern LLMs have also been selected, through a process akin to natural selection in a species. From this perspective, we argue that the performance of LLMs, whether similar or dissimilar to that of humans, does not weigh easily on central debates about the importance of human cognitive biases for language.
