The Emergence of Grammar through Reinforcement Learning

Stephen Wechsler, James W. Shearer, Katrin Erk

TL;DR

This work shows how complex grammatical systems can emerge from reinforcement learning driven by speakers’ expressive preferences. By embedding message probabilities into Harley-Roth-Erev learning, the authors derive a fundamental model (Cat Walking in Grass) in which semantic composition and interpretation arise as convergence results. Forgetting accelerates convergence, and diversity across speakers remains compatible with a single emergent grammar. The paper extends this framework with similarity-based learning across verbs, recursion, and explicit forms, culminating in a Form Competition Model that accounts for grammaticalization and case marking in languages like English. Analytic theorems and numerical simulations together give the usage-based account of grammar emergence a rigorous footing, and two case studies from the history of English illustrate the model’s explanatory power and efficiency considerations. The approach offers a quantitative, testable path for connecting cognitive learning processes to historical and cross-linguistic grammatical structure.

Abstract

The evolution of grammatical systems of syntactic and semantic composition is modeled here with a novel application of reinforcement learning theory. To test the functionalist thesis that speakers' expressive purposes shape their language, we include within the model a probability distribution over different messages that could be expressed in a given context. The proposed learning and production algorithm then breaks down language learning into a sequence of simple steps, such that each step benefits from the message probabilities. The results are presented in the form of numerical simulations of language histories and analytic proofs. The potential for applying these mathematical models to the study of natural language is illustrated with two case studies from the history of English.

Paper Structure

This paper contains 45 sections, 14 theorems, 31 equations, 8 figures, 9 tables.

Key Result

Theorem 1

(Fundamental Theorem: Emergence of Semantic Composition) Suppose $p(m_1 \mid s) > p(m_2 \mid s)$ in the production algorithm for a walking event described above. Then, for any values of $\alpha \geq 0$ and $c_1^0,\, c_2^0 > 0$, as the number of utterances in the language history grows, we have a) the count ra
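The theorem statement is cut off here, so its exact conclusion is not reproduced. As a rough illustration of the kind of dynamics it concerns, the following is a minimal sketch of generic urn-style reinforcement, not the authors' production algorithm. It assumes two competing messages with fixed probabilities `p`, positive start counts (standing in for $c_1^0, c_2^0$), a selection rule proportional to count times message probability, and a reinforcement of 1 per utterance. All function names are hypothetical, and the $\alpha$ parameter is left out because its role is not specified in this summary.

```python
import random


def simulate_counts(p, start_counts, n_utterances, seed=0):
    """Run a toy reinforcement history and return the final counts.

    Assumption (not from the paper): at each utterance, message i is
    selected with probability proportional to counts[i] * p[i], and the
    selected message's count is then incremented by 1.
    """
    rng = random.Random(seed)  # seeded so that a run is reproducible
    counts = [float(c) for c in start_counts]
    indices = range(len(counts))
    for _ in range(n_utterances):
        # Weight each message by its accumulated count and its probability.
        weights = [c * q for c, q in zip(counts, p)]
        chosen = rng.choices(indices, weights=weights, k=1)[0]
        counts[chosen] += 1.0
    return counts


def share(counts, i=0):
    """Fraction of the total count held by message i."""
    return counts[i] / sum(counts)


if __name__ == "__main__":
    # Equal start counts: the more probable message takes over quickly.
    even = simulate_counts(p=[0.8, 0.2], start_counts=[1, 1], n_utterances=20000)
    # Start counts biased toward the less probable message: slower takeover.
    biased = simulate_counts(p=[0.8, 0.2], start_counts=[1, 50], n_utterances=20000)
    print("share of m_1, even start:  ", round(share(even), 3))
    print("share of m_1, biased start:", round(share(biased), 3))
```

In this toy setting the count share of the more probable message grows toward 1 as the history lengthens, and a large start count on the less probable message slows that growth without preventing it, which is the qualitative pattern the captions of Figures 3 and 5 describe.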

Figures (8)

  • Figure 1: Acts of reference (form-scene maps) from two sample 1-word utterances
  • Figure 2: Acts of reference (form-scene maps) from two sample 2-word phrasal utterances
  • Figure 3: Fast convergence when the probability of the first message (blue) is larger than others.
  • Figure 4: Initially ineffective conditioning due to a large $\alpha$ parameter.
  • Figure 5: Large start value on message with lowest probability delays but doesn't stop convergence of highest probability message.
  • ...and 3 more figures

Theorems & Definitions (14)

  • Theorem 1
  • Theorem 2
  • Theorem 3
  • Theorem 4
  • Theorem 5
  • Theorem 6
  • Lemma 1
  • Lemma 2
  • Theorem 7
  • Theorem 8
  • ...and 4 more