Measuring Meaning Composition in the Human Brain with Composition Scores from Large Language Models

Changjiang Gao, Jixing Li, Jiajun Chen, Shujian Huang

TL;DR

This work introduces the Composition Score, a model-based metric that draws on the key-value memory interpretation of transformer FFN blocks to quantify meaning composition during sentence processing. The authors compute layer-wise, per-neuron vocabulary distributions and compare them to the FFN output via Jensen–Shannon distances to assess how distributed or sparse the compositional process is, then relate these scores to fMRI data collected during naturalistic listening to The Little Prince. Across the 32 layers of the LLaMA2 models, Composition Scores show distinct patterns and stronger brain associations than several control variables (word rate, word frequency, and node counts). Significant clusters appear in the left inferior frontal gyrus and left posterior temporal regions, suggesting a multifaceted neural basis for composition involving word frequency, structural processing, and general word sensitivity. The study demonstrates that model-derived Composition Scores reflect brain processes beyond simple word memorization, contributing a novel quantitative bridge between LLM internal states and human neural mechanisms of meaning assembly, while acknowledging limitations such as generalizability across models and languages.
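The comparison step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, array shapes, the use of an unembedding matrix to map vectors to vocabulary logits, and the toy random data are all assumptions made for illustration, and how the per-neuron distances are aggregated into a final Composition Score is not specified in this summary, so the sketch stops at the distances.

```python
import numpy as np


def softmax(logits, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = logits - logits.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def js_distance(p, q, eps=1e-12):
    """Jensen-Shannon distance (square root of the base-2 JS divergence).

    Works on the last axis and broadcasts over leading axes.
    The result lies in [0, 1].
    """
    # Clip and renormalize so that zero entries do not produce log(0).
    p = np.clip(p, eps, None)
    q = np.clip(q, eps, None)
    p = p / p.sum(axis=-1, keepdims=True)
    q = q / q.sum(axis=-1, keepdims=True)
    m = 0.5 * (p + q)
    kl_pm = np.sum(p * np.log2(p / m), axis=-1)
    kl_qm = np.sum(q * np.log2(q / m), axis=-1)
    return np.sqrt(np.maximum(0.5 * (kl_pm + kl_qm), 0.0))


def neuron_to_output_distances(coeffs, values, unembed):
    """JS distance between each neuron's vocabulary distribution and the
    distribution induced by the full FFN output.

    Hypothetical shapes (assumptions, not taken from the paper):
      coeffs:  (n_neurons,)          activation coefficient of each neuron
      values:  (n_neurons, d_model)  value vector of each neuron
      unembed: (d_model, vocab)      projection to vocabulary logits
    """
    neuron_dists = softmax(values @ unembed)     # (n_neurons, vocab)
    ffn_output = coeffs @ values                 # (d_model,)
    output_dist = softmax(ffn_output @ unembed)  # (vocab,)
    return js_distance(neuron_dists, output_dist[None, :])


# Toy usage with random data standing in for real model weights.
rng = np.random.default_rng(0)
toy_coeffs = rng.normal(size=8)
toy_values = rng.normal(size=(8, 16))
toy_unembed = rng.normal(size=(16, 50))
toy_distances = neuron_to_output_distances(toy_coeffs, toy_values, toy_unembed)
```

The square root of the base-2 divergence is used here so that each distance is bounded between 0 (identical distributions) and 1 (disjoint support), which keeps values comparable across layers in this toy setting.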

Abstract

The process of meaning composition, wherein smaller units like morphemes or words combine to form the meaning of phrases and sentences, is essential for human sentence comprehension. Despite extensive neurolinguistic research into the brain regions involved in meaning composition, a computational metric to quantify the extent of composition is still lacking. Drawing on the key-value memory interpretation of transformer feed-forward network blocks, we introduce the Composition Score, a novel model-based metric designed to quantify the degree of meaning composition during sentence comprehension. Experimental findings show that this metric correlates with brain clusters associated with word frequency, structural processing, and general sensitivity to words, suggesting the multifaceted nature of meaning composition during human sentence comprehension.

Paper Structure

This paper contains 44 sections, 6 equations, 12 figures, 2 tables.

Figures (12)

  • Figure 1: Comparing Composition Scores with fMRI data during naturalistic listening comprehension.
  • Figure 2: The average Composition Score of each layer of the LLaMA2 models and a randomly initialized model.
  • Figure 3: (a) Density plots of word frequency and of the top-down, bottom-up, and left-corner node counts. Note that a density plot differs from a histogram in that the values on the y-axis represent probability density and the total area under the curve integrates to one. (b) Correlation matrix among the 4 control variables.
  • Figure 4: Correlation matrix among the 32 layers of LLaMA2-chat.
  • Figure 5: The regression scores $R^2$ between the Composition Scores from LLaMA2-chat and the control variables.
  • ...and 7 more figures