Exploring Similarity between Neural and LLM Trajectories in Language Processing
Xin Xiao, Kaiwen Wei, Jiang Zhong, Xuekai Wei, Mingliang Zhou
TL;DR
This work examines how brain activity during language comprehension relates to the evolving internal representations of large language models (LLMs). The authors quantify both representational similarity and dynamical trajectory alignment between EEG data and 16 public LLMs in English and Chinese. They combine ridge regression with RSA/CKA and a Latent Trajectory Comparison framework that covers Magnitude, Angle, Uncertainty, Confidence, and MI, together with the Dynamic Representational Alignment metric. Three findings stand out: middle-to-high LLM layers contribute to semantic integration in a way that resembles the brain's N400 component; brain activity is continuous, whereas LLMs exhibit discrete, stage-like bursts; and alignment is stronger for English than for Chinese, likely due to biases in instruction-tuning data. The study provides a framework for assessing brain–LLM alignment beyond static representations, highlighting both shared semantic processing mechanisms and fundamental differences in temporal dynamics, and is intended to inform future multilingual model design and cognitive neuroscience research.
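As an illustration of the CKA comparison mentioned above, the following is a minimal sketch of linear CKA between two response matrices whose rows are the same stimuli (for example, words). It is not the authors' implementation: the function name `linear_cka`, the choice of the linear (rather than kernel) variant, and the matrix shapes are assumptions made for this example.

```python
import numpy as np


def linear_cka(X, Y):
    """Linear CKA between X (n_samples, d1) and Y (n_samples, d2).

    Hypothetical helper for illustration; rows of X and Y must
    correspond to the same stimuli in the same order.
    """
    # Center every feature so the score ignores constant offsets.
    X = X - X.mean(axis=0, keepdims=True)
    Y = Y - Y.mean(axis=0, keepdims=True)

    # Squared Frobenius norm of the cross-covariance (unnormalized HSIC).
    cross = np.linalg.norm(Y.T @ X, ord="fro") ** 2

    # Self-similarity terms used for normalization.
    norm_x = np.linalg.norm(X.T @ X, ord="fro")
    norm_y = np.linalg.norm(Y.T @ Y, ord="fro")

    return cross / (norm_x * norm_y)
```

The score lies between 0 and 1 and is unchanged by rotating or uniformly rescaling either representation, which is one reason CKA is often used to compare spaces of different dimensionality, such as LLM hidden states and EEG channels.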
Abstract
Understanding the similarity between large language models (LLMs) and human brain activity is crucial for advancing both AI and cognitive neuroscience. In this study, we provide a multilingual, large-scale assessment of this similarity by systematically comparing 16 publicly available pretrained LLMs with human brain responses during natural language processing tasks in both English and Chinese. Specifically, we use ridge regression to assess the representational similarity between LLM embeddings and electroencephalography (EEG) signals, and we analyze the similarity between the "neural trajectory" and the "LLM latent trajectory." The trajectory analysis captures key dynamic patterns, such as magnitude, angle, uncertainty, and confidence. Our findings highlight both similarities and crucial differences in processing strategies: (1) middle-to-high layers of LLMs are central to semantic integration and correspond to the N400 component observed in EEG; (2) the brain exhibits continuous and iterative processing during reading, whereas LLMs often show discrete, stage-end bursts of activity, which suggests a stark contrast in their real-time semantic processing dynamics. This study offers new insights into LLMs and neural processing and establishes a framework for future investigations into the alignment between artificial intelligence and biological intelligence.
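The ridge-regression step described above can be sketched as a simple encoding model: LLM embeddings predict EEG responses, and the fit is scored by the correlation between predicted and held-out signals. This is a minimal sketch under stated assumptions, not the paper's pipeline. The function names `ridge_fit` and `encoding_score`, the fixed penalty `alpha`, the single train/test split, and the use of per-channel Pearson correlation as the score are all choices made for this example.

```python
import numpy as np


def ridge_fit(X, Y, alpha=1.0):
    """Closed-form ridge weights W minimizing ||XW - Y||^2 + alpha * ||W||^2."""
    n_features = X.shape[1]
    gram = X.T @ X + alpha * np.eye(n_features)
    return np.linalg.solve(gram, X.T @ Y)


def encoding_score(X_train, Y_train, X_test, Y_test, alpha=1.0):
    """Per-channel Pearson r between predicted and held-out EEG.

    X_*: (n_words, n_embedding_dims) LLM embeddings (assumed layout).
    Y_*: (n_words, n_channels) EEG responses (assumed layout).
    """
    # Center with training statistics only, to avoid test-set leakage.
    x_mean = X_train.mean(axis=0)
    y_mean = Y_train.mean(axis=0)
    W = ridge_fit(X_train - x_mean, Y_train - y_mean, alpha)
    Y_pred = (X_test - x_mean) @ W + y_mean

    # Pearson correlation computed independently for each EEG channel.
    pred_c = Y_pred - Y_pred.mean(axis=0)
    true_c = Y_test - Y_test.mean(axis=0)
    denom = np.linalg.norm(pred_c, axis=0) * np.linalg.norm(true_c, axis=0)
    return (pred_c * true_c).sum(axis=0) / denom
```

In practice the penalty would normally be chosen by cross-validation and the score averaged over folds; repeating the fit for each LLM layer gives a layer-by-layer profile of how well that layer predicts the EEG signal.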
