On the Shape of Brainscores for Large Language Models (LLMs)
Jingkai Li
TL;DR
The paper asks how to meaningfully interpret Brainscore, a metric that evaluates the functional similarity between LLM representations and human brain activity. It introduces a topological feature framework based on Persistent Homology and $q$-Wasserstein distances to compare fMRI data and LLM embeddings, then uses cross-validated Linear Regression to identify reliable, interpretable feature combinations across ROIs and hemispheres. The results reveal distinct feature sets that aid Brainscore interpretation, while highlighting limited predictive power and suggesting ways to improve stability, including tuning $q$ and $p$ and trying alternative regression approaches. Overall, the work advances interpretable cross-domain analysis between neural data and large language models, with potential implications for developing brain-like AI representations and evaluation metrics.
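For orientation, the $q$-Wasserstein distance between two persistence diagrams is usually written as below. This is the standard textbook form, not necessarily the exact convention used in the paper. In particular, reading $p$ as the order of the ground norm is an assumption, and the symbols $D_1$, $D_2$, $\Delta$, and $\gamma$ are introduced here for illustration only.

$$
W_{q}(D_1, D_2) \;=\; \left( \inf_{\gamma \,:\, D_1 \cup \Delta \,\to\, D_2 \cup \Delta} \; \sum_{x \in D_1 \cup \Delta} \lVert x - \gamma(x) \rVert_{p}^{\,q} \right)^{1/q}
$$

Here $D_1$ and $D_2$ are multisets of (birth, death) points, $\Delta$ is the diagonal (points with birth equal to death, available with unlimited multiplicity so that unmatched features can be paired with it), and $\gamma$ ranges over bijections between the augmented diagrams. Larger $q$ puts more weight on the largest individual mismatches, which is one reason the choice of $q$ can affect the stability of the derived features.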
Abstract
With the rise of Large Language Models (LLMs), the novel metric "Brainscore" emerged as a means to evaluate the functional similarity between LLMs and human brain/neural systems. We investigated what this score means by constructing topological features from human fMRI data covering 190 subjects and from 39 LLMs plus their untrained counterparts. We then trained 36 Linear Regression models and ran statistical analyses to identify which of the constructed features are reliable and valid. Our findings reveal distinctive feature combinations that help interpret existing Brainscores across brain regions of interest (ROIs) and hemispheres, contributing to interpretable machine learning (iML) studies. We also provide further discussion and analysis of existing Brainscores. To our knowledge, this study is the first attempt to interpret the Brainscore metric in this interdisciplinary domain.
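The regression step can be illustrated with a minimal sketch: fit an ordinary least-squares model that predicts a score from a few features, and estimate its held-out error with k-fold cross-validation. This is not the authors' code. The function names (`fit_ols`, `cross_validated_mse`), the fold assignment by index stride, and the default of five folds are all assumptions made for illustration, and the sketch uses only the Python standard library.

```python
from typing import List, Sequence

Vector = List[float]
Matrix = List[List[float]]


def solve_linear_system(a: Matrix, b: Vector) -> Vector:
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs are not mutated.
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular system: features may be collinear")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x


def fit_ols(features: Matrix, targets: Vector) -> Vector:
    """Return [intercept, w_1, ..., w_d] from the normal equations."""
    design = [[1.0] + list(row) for row in features]
    d = len(design[0])
    gram = [[sum(r[i] * r[j] for r in design) for j in range(d)] for i in range(d)]
    rhs = [sum(r[i] * y for r, y in zip(design, targets)) for i in range(d)]
    return solve_linear_system(gram, rhs)


def predict(coef: Vector, row: Sequence[float]) -> float:
    """Apply fitted coefficients to one feature row."""
    return coef[0] + sum(w * x for w, x in zip(coef[1:], row))


def cross_validated_mse(features: Matrix, targets: Vector, k: int = 5) -> float:
    """Average held-out mean squared error over k folds (sample i goes to fold i % k)."""
    n = len(targets)
    fold_errors = []
    for fold in range(k):
        test_idx = [i for i in range(n) if i % k == fold]
        train_idx = [i for i in range(n) if i % k != fold]
        coef = fit_ols([features[i] for i in train_idx],
                       [targets[i] for i in train_idx])
        errs = [(predict(coef, features[i]) - targets[i]) ** 2 for i in test_idx]
        fold_errors.append(sum(errs) / len(errs))
    return sum(fold_errors) / k
```

In a setting like the one described, each row of `features` would hold the topological features for one model and `targets` the corresponding Brainscore for one ROI and hemisphere. Comparing `cross_validated_mse` across candidate feature subsets is one simple way to see which combinations generalize rather than merely fit.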
