Language Models as Models of Language
Raphaël Millière
TL;DR
This chapter critically evaluates whether modern language models can inform theoretical linguistics, especially the study of syntax and language acquisition. It surveys relevant historical developments, three empirical methodologies (behavioural tests, probing, and causal interventions), and three modelling targets (performance, competence, and acquisition). It presents converging evidence that Transformer-based LMs learn hierarchical syntactic structure and encode it in representations that causally influence their behaviour, while also highlighting methodological caveats and remaining gaps. It argues for cautious integration and closer collaboration between linguists and computational researchers, so that experiments with LMs can help constrain hypotheses in debates about linguistic nativism and learnability.
Abstract
This chapter critically examines the potential contributions of modern language models to theoretical linguistics. Despite their focus on engineering goals, these models' ability to acquire sophisticated linguistic knowledge from mere exposure to data warrants a careful reassessment of their relevance to linguistic theory. I review a growing body of empirical evidence suggesting that language models can learn hierarchical syntactic structure and exhibit sensitivity to various linguistic phenomena, even when trained on developmentally plausible amounts of data. While the competence/performance distinction has been invoked to dismiss the relevance of such models to linguistic theory, I argue that this assessment may be premature. By carefully controlling learning conditions and making use of causal intervention methods, experiments with language models can potentially constrain hypotheses about language acquisition and competence. I conclude that closer collaboration between theoretical linguists and computational researchers could yield valuable insights, particularly in advancing debates about linguistic nativism.
