Performance and competence intertwined: A computational model of the Null Subject stage in English-speaking children

Soumik Dey, William Gregory Sakas

TL;DR

This work models the English NS stage as a superset-subset grammar learning problem by extending the Variational Learner with a Superset-Subset framework (SSVL) and a performance parameter, Illocution Ambiguity Resolution Coefficient (IARC). Using the CUNY-CoLAG domain, it demonstrates that early misinterpretations of imperative NS sentences as declaratives can bias learning, but a dual-rate reward strategy guides convergence toward the target superset grammar. The paper provides a principled, computational approach to integrate performance constraints with grammatical acquisition and offers insights into how developmentally changing interpretive abilities shape the NS stage. The framework and findings have broader implications for modeling how performance factors interact with parameter learning in natural language acquisition.

Abstract

The empirically established null subject (NS) stage, lasting until about 4 years of age, involves frequent omission of subjects by children. Orfitelli and Hyams (2012) observe that young English speakers often confuse imperative NS utterances with declarative ones due to performance influences, promoting a temporary null subject grammar. We propose a new computational parameter to measure this misinterpretation and incorporate it into a simulated model of obligatory subject grammar learning. Using a modified version of the Variational Learner (Yang, 2012) which works for superset-subset languages, our simulations support Orfitelli and Hyams' hypothesis. More generally, this study outlines a framework for integrating computational models in the study of grammatical acquisition alongside other key developmental factors.
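As a rough illustration of the kind of update a variational learner applies, the sketch below implements a linear reward-penalty step for a single binary parameter. It is a toy under stated assumptions, not the paper's SSVL: the function names, the input model (`p_needs_one`), and the rate used in the usage note are hypothetical. The superset-subset handling, the dual-rate reward, and IARC are not modeled.

```python
import random


def update_weight(w, chose_one, parsed, rate):
    """Linear reward-penalty step for w = P(parameter value is 1).

    Value 1 gains weight when it was chosen and the parse succeeded,
    or when value 0 was chosen and the parse failed; otherwise it loses weight.
    """
    if chose_one == parsed:
        return w + rate * (1.0 - w)
    return (1.0 - rate) * w


def simulate(n_utterances, p_needs_one, rate, w=0.5, seed=0):
    """Run the toy learner over n_utterances and return the final weight.

    Hypothetical input model (an assumption, not from the paper): each
    utterance is parsable only with value 1 (probability p_needs_one)
    or only with value 0 (otherwise).
    """
    rng = random.Random(seed)
    for _ in range(n_utterances):
        chose_one = rng.random() < w          # sample a parameter value
        needs_one = rng.random() < p_needs_one
        parsed = chose_one == needs_one       # parse succeeds only on a match
        w = update_weight(w, chose_one, parsed, rate)
    return w
```

With `p_needs_one=1.0` every utterance pushes the weight toward 1, so `simulate(1000, 1.0, 0.01)` ends very close to 1. Mixed input makes the weight drift toward whichever value the evidence favors.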

Paper Structure

This paper contains 17 sections, 8 equations, 6 figures, 5 tables, 4 algorithms.

Figures (6)

  • Figure 1: Performance on NS condition sentences (from doi:10.2307/23358022, Figure 5).
  • Figure 2: The SSVL with a conservative learning rate of $r=1.24\times 10^{-7}$. The NS parameter weight is plotted on the y-axis and the number of utterances on the x-axis. Six-month intervals from age 2;6 to 4;0, measured in number of utterances, are also marked.
  • Figure 3: Performance of 100 e-children with a bin size of 20 and the Gaussian kernel estimation of the probability density function (PDF) across 3 age groups generated using a truncated Gaussian distribution emulating O&H.
  • Figure 4: $IARC_{linear}$ growth function
  • Figure 5: $IARC_{logistic}$ growth function
  • ...and 1 more figure