
Convergent Representations of Linguistic Constructions in Human and Artificial Neural Systems

Pegah Ramezani, Thomas Kinfe, Andreas Maier, Achim Schilling, Patrick Krauss

Abstract

Understanding how the brain processes linguistic constructions is a central challenge in cognitive neuroscience and linguistics. Recent computational studies show that artificial neural language models spontaneously develop differentiated representations of Argument Structure Constructions (ASCs), generating predictions about when and how construction-level information emerges during processing. The present study tests these predictions in human neural activity using electroencephalography (EEG). Ten native English speakers listened to 200 synthetically generated sentences across four construction types (transitive, ditransitive, caused-motion, resultative) while neural responses were recorded. Analyses using time-frequency methods, feature extraction, and machine learning classification revealed construction-specific neural signatures emerging primarily at sentence-final positions, where argument structure becomes fully disambiguated, and most prominently in the alpha band. Pairwise classification showed reliable differentiation, especially between ditransitive and resultative constructions, while other pairs overlapped. Crucially, the temporal emergence and similarity structure of these effects mirror patterns in recurrent and transformer-based language models, where constructional representations arise during integrative processing stages. These findings support the view that linguistic constructions are neurally encoded as distinct form-meaning mappings, in line with Construction Grammar, and suggest convergence between biological and artificial systems on similar representational solutions. More broadly, this convergence is consistent with the idea that learning systems discover stable regions within an underlying representational landscape - recently termed a Platonic representational space - that constrains the emergence of efficient linguistic abstractions.

Paper Structure

This paper contains 28 sections, 10 figures, and 3 tables.

Figures (10)

  • Figure 1: Token Duration Variability Across Syntactic Roles. This figure presents boxplots of the durations of subject, verb, and object tokens in the stimulus set. Each boxplot displays the median, interquartile range, whiskers, and outliers, showing that all three syntactic roles exhibit substantial variability in word length. Object tokens show the widest distribution, while verbs are generally shorter. The visualization illustrates the challenge posed by duration differences in natural-language EEG experiments, which motivated the use of maximum-length epoch alignment for ERP analyses and length-independent feature extraction for statistical analyses.
  • Figure 2: ERP Waveforms Time-Locked to Sentence Onset Across All Channels. This figure shows the averaged event-related potentials from 63 EEG channels, aligned to the onset of each sentence. Each colored trace represents one electrode, illustrating the spatial diversity of responses across the scalp. A prominent positive deflection around approximately 200 ms is visible across many channels, corresponding to the P200 component associated with early perceptual and lexical processing. The waveform morphology indicates that the dataset contains robust time-locked neural responses suitable for later time–frequency analyses across construction types.
  • Figure 3: Variation in Sentence Length Across Constructions. This figure presents boxplots of sentence durations for the four construction types used in the experiment: Cause_Motion, Resultative, Ditransitive, and Transitive. Each boxplot shows the median, interquartile range, whiskers, and outliers, illustrating that sentence length varies substantially both within and across constructions. The Transitive sentences tend to be shorter on average, while the other three constructions cluster around longer median durations. This variability motivates standardizing epoch length in EEG analyses, ensuring that neural responses can be aligned and compared despite differences in stimulus duration.
  • Figure 4: Time-frequency representations for each construction type after removing the common average across constructions. The plots show the average Morlet wavelet power (2--45 Hz) for the four construction types (Cause_Motion, Resultative, Ditransitive, Transitive). Each subplot visualizes power over time and frequency with a shared color scale. Subtracting the common average emphasizes where oscillatory dynamics diverge between constructions, thereby making construction-specific spectral patterns visible across corresponding sentence positions.
  • Figure 5: Significant EEG Feature Counts Across Construction Pairs and Syntactic Roles. This figure displays a heatmap summarizing the number of EEG features that showed significant differences (p < 0.05, Benjamini–Hochberg-corrected) for each construction pair and each syntactic role: subject, verb, and object. Lighter colors indicate fewer significant features, while darker colors correspond to more distinguishing features. The subject position shows almost no significant differences across all construction pairs, suggesting that early sentence information does not permit construction discrimination. Verb-aligned epochs exhibit only sparse effects, consistent with verbs being less construction-specific. In contrast, object-aligned epochs show substantially more significant features, particularly for the Ditransitive–Resultative pair, indicating that construction-specific neural signatures emerge primarily when listeners have accumulated sufficient contextual information later in the sentence.
  • ...and 5 more figures
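The common-average contrast described for Figure 4 can be sketched in a few lines. The snippet below is a minimal illustration, not the authors' pipeline: it uses a hand-rolled Morlet convolution (a stand-in for library routines such as MNE's `tfr_array_morlet`), toy single-channel signals with construction-specific alpha amplitude, and illustrative sampling parameters.

```python
import numpy as np

def morlet_power(signal, sfreq, freqs, n_cycles=7.0):
    """Power over time and frequency via complex Morlet wavelet convolution.

    Wavelet width per frequency is n_cycles / frequency, a common default.
    """
    power = np.empty((len(freqs), len(signal)))
    for i, f in enumerate(freqs):
        sigma_t = n_cycles / (2.0 * np.pi * f)            # temporal width (s)
        t = np.arange(-3.5 * sigma_t, 3.5 * sigma_t, 1.0 / sfreq)
        wavelet = np.exp(2j * np.pi * f * t) * np.exp(-t**2 / (2 * sigma_t**2))
        wavelet /= np.sqrt(np.sum(np.abs(wavelet) ** 2))  # unit energy
        power[i] = np.abs(np.convolve(signal, wavelet, mode="same")) ** 2
    return power

sfreq = 200.0                                   # illustrative sampling rate (Hz)
times = np.arange(0.0, 5.0, 1.0 / sfreq)
# Toy per-construction "EEG": 10 Hz alpha with construction-specific amplitude.
epochs = {name: amp * np.sin(2.0 * np.pi * 10.0 * times)
          for name, amp in [("Cause_Motion", 1.0), ("Resultative", 1.2),
                            ("Ditransitive", 0.8), ("Transitive", 1.5)]}

freqs = np.arange(2.0, 46.0, 1.0)               # 2--45 Hz, as in Figure 4
tfrs = {name: morlet_power(sig, sfreq, freqs) for name, sig in epochs.items()}

# Subtract the common average across constructions so only divergent
# oscillatory dynamics remain (the contrast plotted in Figure 4).
common = np.mean(list(tfrs.values()), axis=0)
contrasts = {name: tfr - common for name, tfr in tfrs.items()}
```

By construction, the four contrast maps sum to zero at every time-frequency point, which is why this subtraction highlights where constructions diverge rather than what they share.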
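The significant-feature counts in Figure 5 rest on mass-univariate tests with Benjamini-Hochberg FDR correction. The outline below is an illustrative reconstruction, not the authors' code: the feature matrices are synthetic, the per-feature test is a plain independent-samples t-test, and only the BH step follows the standard procedure (flag all p-values up to the largest rank i with p_(i) <= alpha * i / m).

```python
import numpy as np
from scipy.stats import ttest_ind

def benjamini_hochberg(pvals, alpha=0.05):
    """Boolean mask of p-values significant under BH false-discovery control."""
    p = np.asarray(pvals)
    m = len(p)
    order = np.argsort(p)
    thresh = alpha * np.arange(1, m + 1) / m
    below = p[order] <= thresh
    mask = np.zeros(m, dtype=bool)
    if below.any():
        k = np.max(np.where(below)[0])       # largest rank meeting its threshold
        mask[order[:k + 1]] = True           # all smaller p-values pass too
    return mask

# Toy features for one construction pair and one syntactic role:
# 40 epochs per construction, 120 EEG features, 10 of which truly differ.
rng = np.random.default_rng(0)
feat_a = rng.normal(size=(40, 120))
feat_b = rng.normal(size=(40, 120))
feat_b[:, :10] += 1.2                        # injected construction difference

pvals = ttest_ind(feat_a, feat_b, axis=0).pvalue
sig = benjamini_hochberg(pvals, alpha=0.05)
n_sig = int(sig.sum())                       # one cell of the Figure 5 heatmap
```

Repeating this count over every construction pair and every role-aligned epoch type (subject, verb, object) yields the full heatmap; near-zero counts for subject-aligned epochs then mean that no feature survives correction there.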