Training-Driven Representational Geometry Modularization Predicts Brain Alignment in Language Models
Yixuan Liu, Zhiyuan Ma, Likai Tang, Runmin Gan, Xinche Zhang, Jinhao Li, Chao Xie, Sen Song
TL;DR
This work investigates how training reshapes the internal geometry of Transformer representations and how these geometric changes relate to brain alignment during language processing. By tracking entropy $E^{(l,s)}$ and curvature $C^{(l,s)}$ across Pythia layers $l$ and training checkpoints $s$, the authors reveal a stable modularization into low- and high-complexity layers, with the low-complexity module consistently yielding higher fMRI encoding scores across the left-hemisphere language network. Crucially, curvature emerges as a robust predictor of brain alignment, even after accounting for training progress, and this curvature-alignment coupling strengthens with model scale. The findings suggest that training-driven representational smoothing facilitates neural-like processing and that geometry offers a mechanistic, scalable lens for understanding model-brain alignment beyond traditional linguistic features.
Abstract
How large language models (LLMs) align with the neural representation and computation of human language is a central question in cognitive science. Using representational geometry as a mechanistic lens, we addressed this question by tracking entropy, curvature, and fMRI encoding scores throughout the training of Pythia models (70M to 1B parameters). We identified a geometric modularization in which layers self-organize into stable low- and high-complexity clusters. The low-complexity module, characterized by reduced entropy and curvature, consistently predicted human language network activity better than the high-complexity module. This alignment followed heterogeneous spatial-temporal trajectories: rapid and stable in temporal regions (AntTemp, PostTemp), but delayed and dynamic in frontal areas (IFG, IFGorb). Crucially, reduced curvature remained a robust predictor of model-brain alignment even after controlling for training progress, an effect that strengthened with model scale. These results link training-driven geometric reorganization to temporal-frontal functional specialization, suggesting that representational smoothing facilitates neural-like linguistic processing.
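The abstract does not say how curvature is computed. As a rough illustration only, the sketch below assumes one definition commonly used for representational trajectories: the mean angle between successive displacement vectors as a layer's hidden state moves from token to token. The function name `mean_curvature` and the list-of-vectors input format are hypothetical, and the authors' actual measure may differ.

```python
import math


def mean_curvature(states):
    """Mean turning angle (radians) along a trajectory of hidden states.

    `states` is a sequence of equal-length vectors, one per token, taken
    from a single layer. ASSUMPTION: curvature is the average angle between
    consecutive displacement vectors; this is not confirmed by the source.
    """
    if len(states) < 3:
        raise ValueError("need at least 3 states to define a turning angle")

    # Displacement between each pair of consecutive token representations.
    diffs = [[b - a for a, b in zip(p, q)] for p, q in zip(states, states[1:])]

    angles = []
    for u, v in zip(diffs, diffs[1:]):
        norm_u = math.sqrt(sum(x * x for x in u))
        norm_v = math.sqrt(sum(x * x for x in v))
        if norm_u == 0.0 or norm_v == 0.0:
            continue  # a repeated state has no direction; skip this angle
        cosine = sum(x * y for x, y in zip(u, v)) / (norm_u * norm_v)
        cosine = max(-1.0, min(1.0, cosine))  # guard against rounding drift
        angles.append(math.acos(cosine))

    if not angles:
        raise ValueError("trajectory has no well-defined turning angles")
    return sum(angles) / len(angles)
```

Under this definition a perfectly straight trajectory has curvature 0 and a single right-angle turn has curvature π/2, so "reduced curvature" corresponds to hidden states that move through representation space along straighter, smoother paths.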
