Opening the black box of language acquisition

Jérôme Michaud; Anna Jon-and

TL;DR

The paper presents a minimal, cognitively plausible architecture for language learning that replaces neural networks with sequence memory and hierarchical chunking, learned via error-correction temporal-difference reinforcement. Framed as a Markov Decision Process, the learner identifies sentence boundaries in input streams generated by probabilistic context-free grammars, using state–action values $Q(s,a)$ and a softmax policy to drive decisions. Across NVN baselines and increasingly complex artificial languages (MD, RelClause, ComplexNP), the model demonstrates (i) successful learning from small data, (ii) robust extraction and reuse of grammatical chunks, and (iii) emergent structures that parallel human language learning under memory constraints. The work highlights the potential sufficiency of simple, cognitively grounded mechanisms—sequence memory, chunking, and TD reinforcement—for grammar emergence, offering a tractable, interpretable alternative to opaque deep learning approaches and informing theories of human language acquisition.
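The decision and learning steps named above can be illustrated with a minimal sketch. It assumes a tabular store of $Q(s,a)$ values, a softmax policy with a temperature parameter, and a simple error-correction update that moves $Q(s,a)$ toward the received reward. The action labels, the state label, the learning rate `alpha`, and the `temperature` are hypothetical placeholders, not values taken from the paper, and the paper's actual update rule may differ.

```python
import math
import random


def softmax_policy(q_values, temperature=1.0):
    """Map action values to probabilities proportional to exp(Q / temperature)."""
    top = max(q_values.values())  # subtract the max for numerical stability
    weights = {a: math.exp((q - top) / temperature) for a, q in q_values.items()}
    total = sum(weights.values())
    return {a: w / total for a, w in weights.items()}


def sample_action(probs, rng):
    """Draw one action according to the given probabilities."""
    actions = list(probs)
    return rng.choices(actions, weights=[probs[a] for a in actions], k=1)[0]


def error_correction_update(q, state, action, reward, alpha=0.1):
    """Move Q(state, action) a fraction alpha toward the reward (prediction error)."""
    old = q.get((state, action), 0.0)
    q[(state, action)] = old + alpha * (reward - old)
    return q[(state, action)]


# Hypothetical labels: one border-placing action and three chunking actions.
ACTIONS = ["border", "chunk_a", "chunk_b", "chunk_c"]

q_table = {}
state = "N V"  # hypothetical state label
rng = random.Random(0)

probs = softmax_policy({a: q_table.get((state, a), 0.0) for a in ACTIONS})
action = sample_action(probs, rng)
error_correction_update(q_table, state, action, reward=1.0)
```

With all values initialised to zero, the policy is uniform over the four actions. A positive reward raises the value of the chosen action, so the softmax makes it more likely the next time the same state is encountered.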

Abstract

Recent advances in large language models using deep learning techniques have renewed interest in how languages can be learned from data. However, it is unclear whether or how these models represent grammatical information about the languages they learn. In addition, the models must be pre-trained on large corpora before they can be used. In this work, we propose an alternative, more transparent and cognitively plausible architecture for learning language. Instead of using deep learning, our approach uses a minimal cognitive architecture based on sequence memory and chunking. The learning mechanism is based on the principles of reinforcement learning. We test our architecture on a number of natural-like toy languages. Results show that the model can learn these artificial languages from scratch and extract grammatical information that supports learning. Our study demonstrates the power of this simple architecture and stresses the importance of sequence memory as a key component of the language learning process. Since other animals do not seem to have a faithful sequence memory, this may explain why only humans have developed complex languages.

Paper Structure (26 sections, 14 equations, 5 figures, 8 tables)

Introduction
Background
Divergences in language learning in LLMs and humans
Foundation of the human linguistic capacity: Sequence memory and chunking
Language models learning formal languages
Computational framework
The learning task
MDP formalism: states, actions, and rewards
Specifying the language learning task as an MDP
Learning algorithms
State-action values, sub-states, and sub-actions
Updating state-action pairs
Using state-action pairs to make decisions
Evaluation methods
Extracting learning curves
...and 11 more sections

Figures (5)

Figure 1: Illustration of the four possible actions. On the right, the results of the four actions are shown. As can be seen, there are three different ways of chunking the second element into the first.
Figure 2: Learning curves for the four combinations of sentence conditions QC, QN, RWQC, and RWQN. The NVN language has $K_n = K_v = 5$. Fractions are obtained over 100 agents.
Figure 3: First panel: Learning curve for the Rescorla-Wagner Q-learning with continuous border condition for the MD language with $K_n=5$, $K_m=K_d=1$. Fractions are obtained over 200 agents. Second panel: breakdown of the learning curve by sentence length. Note that if at a given trial no learner encounters a sentence of a given length, then it contributes to sentences of length 0, which are, therefore, meaningless.
Figure 4: First panel: Learning curve for the Rescorla-Wagner Q-learning with continuous border condition for the relative clause language with $K_n=5$, $K_m=K_d=K_r=1$. Fractions are obtained over 100 agents. Second panel: breakdown of the learning curve by sentence length. Note that if at a given trial no learner encounters a sentence of a given length, then it contributes to sentences of length 0, which are, therefore, meaningless. At any given trial, only a fraction of learners encounter a sentence of the same length, which explains the layering structure of the breakdown plot.
Figure 5: Learning curve for the Rescorla-Wagner Q-learning with continuous border condition for the ComplexNP language with $K_n=K_m=K_v=K_a=K_d=K_p=1$. Fractions are obtained over 100 agents.
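
The captions above report learning curves as fractions over a population of agents. A minimal sketch of that aggregation follows, assuming each agent's outcome is recorded as one boolean per trial (correct or not). The function name and the data layout are hypothetical and are not taken from the paper.

```python
def learning_curve(outcomes):
    """Return, for each trial, the fraction of agents that responded correctly.

    outcomes[agent][trial] is True when that agent was correct on that trial.
    All agents are assumed to have the same number of trials.
    """
    n_agents = len(outcomes)
    n_trials = len(outcomes[0])
    return [
        sum(agent[t] for agent in outcomes) / n_agents
        for t in range(n_trials)
    ]


# Two hypothetical agents observed over three trials.
curve = learning_curve([
    [False, True, True],
    [False, False, True],
])
```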
