Left-right asymmetry in predicting brain activity from LLMs' representations emerges with their formal linguistic competence
Laurent Bonnasse-Gahot, Christophe Pallier
TL;DR
The study investigates why a left-right asymmetry in brain predictivity emerges as LLMs train, proposing that formal linguistic competence underlies this effect. Using fMRI data from English and French listeners and a range of benchmarks, the authors show that the left-hemisphere advantage co-emerges with improved performance on formal linguistic tasks (e.g., BLiMP, Zorro, grammatical acceptability) and with the generation of well-formed text, but not with performance on arithmetic, Dyck languages, or tasks involving world knowledge and reasoning (ARC, HellaSwag). This pattern generalizes across model families (OLMo-2 7B and Pythia variants) and languages (French), and extends to cerebellar involvement, suggesting a robust link between competence in syntax and linguistic patterns and LLM-to-brain alignment. The findings imply that this formal competence primarily drives the observed hemispheric lateralization in brain predictivity, with functional competence lagging behind and contributing later. These insights refine our understanding of how artificial language systems align with human neural processing and point to region-specific developmental trajectories for future exploration.
Abstract
When humans and large language models (LLMs) process the same text, activations in the LLMs correlate with brain activity measured, e.g., with functional magnetic resonance imaging (fMRI). Moreover, it has been shown that, as the training of an LLM progresses, the performance in predicting brain activity from its internal activations improves more in the left hemisphere than in the right one. The aim of the present work is to understand which kind of competence acquired by the LLMs underlies the emergence of this left-right asymmetry. Using the OLMo-2 7B language model at various training checkpoints and fMRI data from English-speaking participants, we compare the evolution of the left-right asymmetry in brain scores with performance on several benchmarks. We observe that the asymmetry co-emerges with the formal linguistic abilities of the LLM. These abilities are demonstrated in two ways: by the model's capacity to assign a higher probability to an acceptable sentence than to a grammatically unacceptable one within a minimal contrasting pair, and by its ability to produce well-formed text. In contrast, the left-right asymmetry does not correlate with performance on arithmetic or Dyck language tasks, nor with performance on text-based tasks involving world knowledge and reasoning. We generalize these results to another family of LLMs (Pythia) and another language, namely French. Our observations indicate that the left-right asymmetry in brain predictivity matches the progress in formal linguistic competence (knowledge of linguistic patterns).
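The comparison described in the abstract can be illustrated with a small Python sketch. It is not the authors' pipeline: the function names are hypothetical, the asymmetry is assumed here to be a simple difference of mean brain scores between hemispheres (the abstract does not give the exact definition), and all numbers are made-up toy values.

```python
from math import sqrt


def minimal_pair_accuracy(pairs):
    """Fraction of minimal pairs where the acceptable sentence is more probable.

    `pairs` is a list of (logp_acceptable, logp_unacceptable) tuples, i.e. the
    model's log-probabilities for the two sentences of each contrasting pair.
    """
    hits = sum(1 for good, bad in pairs if good > bad)
    return hits / len(pairs)


def left_right_asymmetry(left_scores, right_scores):
    """Assumed definition: mean left-hemisphere brain score minus mean right one."""
    mean_left = sum(left_scores) / len(left_scores)
    mean_right = sum(right_scores) / len(right_scores)
    return mean_left - mean_right


def pearson(xs, ys):
    """Pearson correlation between two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Toy values for three hypothetical training checkpoints (not real data).
checkpoints = [
    {"pairs": [(-9.0, -8.5), (-7.0, -7.2)], "left": [0.10, 0.12], "right": [0.10, 0.11]},
    {"pairs": [(-8.0, -8.5), (-7.0, -7.2)], "left": [0.20, 0.24], "right": [0.15, 0.17]},
    {"pairs": [(-6.0, -8.5), (-5.0, -7.2)], "left": [0.30, 0.34], "right": [0.18, 0.20]},
]

accuracies = [minimal_pair_accuracy(c["pairs"]) for c in checkpoints]
asymmetries = [left_right_asymmetry(c["left"], c["right"]) for c in checkpoints]

# Correlation across checkpoints between benchmark accuracy and asymmetry.
print(pearson(accuracies, asymmetries))
```

Tracking both quantities over checkpoints and correlating them is one simple way to ask whether a benchmark "co-emerges" with the asymmetry; the same comparison could be repeated with scores from arithmetic, Dyck language, or world-knowledge benchmarks in place of the minimal-pair accuracy.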
