Language models and brains align due to more than next-word prediction and word-level information

Gabriele Merlin; Mariya Toneva

TL;DR

This work contrasts the brain alignment of differently perturbed pretrained language models and shows that improvements in alignment with brain recordings are due to more than improvements in next-word prediction and word-level information.

Abstract

Pretrained language models have been shown to significantly predict brain recordings of people comprehending language. Recent work suggests that the prediction of the next word is a key mechanism that contributes to this alignment. What is not yet understood is whether prediction of the next word is necessary for this observed alignment or simply sufficient, and whether there are other shared mechanisms or information that are similarly important. In this work, we take a step towards understanding the reasons for brain alignment via two simple perturbations in popular pretrained language models. These perturbations help us design contrasts that can control for different types of information. By contrasting the brain alignment of these differently perturbed models, we show that improvements in alignment with brain recordings are due to more than improvements in next-word prediction and word-level information.
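As a rough illustration of the input-scrambling perturbation named in the section outline, the sketch below shuffles the words of a text. The function name `scramble_words`, the whitespace tokenization, and the choice to shuffle the whole passage at once are assumptions made for this example; the abstract does not specify how scrambling is applied. The point it illustrates is that shuffling preserves the multiset of words (a stand-in for word-order-independent, word-level information) while destroying word order.

```python
import random
from collections import Counter


def scramble_words(text, seed=0):
    """Return `text` with its whitespace-separated words in random order.

    Hypothetical helper: the actual scrambling procedure (window size,
    tokenization) is not described in this section.
    """
    words = text.split()
    rng = random.Random(seed)  # local RNG so the shuffle is reproducible
    shuffled = list(words)
    rng.shuffle(shuffled)
    return " ".join(shuffled)


original = "the quick brown fox jumps over the lazy dog"
scrambled = scramble_words(original, seed=0)

# The same words occur the same number of times; only their order may differ.
same_word_counts = Counter(original.split()) == Counter(scrambled.split())
print(scrambled)
print("word counts preserved:", same_word_counts)
```

Feeding both the original and the scrambled text to the same model then gives a pair whose difference cannot be attributed to which words were present.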

Paper Structure (32 sections, 6 equations, 25 figures)

Introduction
Methods
Baseline models
fMRI data
Evaluation tasks
Next-word prediction.
Brain alignment.
Perturbations
Input scrambling.
Stimulus-tuning.
Contrasts to disentangle brain alignment factors
Results
Next-word prediction
Brain alignment
Effects of stimulus-tuning.
...and 17 more sections

Figures (25)

Figure 1: An illustration of additional information that may be important for alignment between language models and brain recordings. Our approach is largely agnostic about the exact linguistic information contained in the conceptual quantities "word-level information" and "multi-word information", and the only assumption is that "word-level information" is not affected by word order.
Figure 2: Performance of the GPT-2-small baseline and perturbed models at next-word prediction, averaged across runs with standard deviation (A), and at brain alignment (C-F). Stimulus-tuning improves both next-word prediction (stimulus-tuned vs baseline in (A)) and brain alignment (D). In contrast, scrambling reduces both next-word prediction (baseline vs baseline scrambled in (A)) and brain alignment (E and F). Despite the reduction in alignment due to the scrambling perturbation, all four models exhibit alignment in language processing regions (B) (see Appendix Figure \ref{fig:appendix_all_subjects_gpt2_small} for brain alignment plots for all participants and Appendix Figures \ref{fig:appendix_all_subjects_gpt2_distill}, \ref{fig:appendix_all_subjects_gpt2_medium} for other models).
Figure 3: Impact of the stimulus-tuning perturbation on the baseline model. For each model (GPT-2-small, medium, distill) we computed the median difference in language and non-language regions across participants. Here we display the average difference across models as well as the standard deviation. Results for the single models are reported in Appendix Figures \ref{fig:whole_text_vs_base_gpt2_small}, \ref{fig:whole_text_vs_base_gpt2_distill}, \ref{fig:whole_text_vs_base_gpt2_medium}.
Figure 4: Impact of the stimulus-tuning perturbation on the baseline model. For each model (GPT-2-small, medium, distill) we computed the median percentage gain by stimulus-tuned over baseline in language regions across participants. Here we display the average percentage gain across models as well as the standard deviation. We include only voxels with estimated noise ceiling values >0.05. Results for the single models are reported in Appendix Figures \ref{fig:roi_text_vs_base_gpt2_small}, \ref{fig:roi_text_vs_base_gpt2_distill}, \ref{fig:roi_text_vs_base_gpt2_medium}.
Figure 5: Impact of the scrambling perturbation on the stimulus-tuned model versus its impact on the baseline model. For each model (GPT-2-small, medium, distill) we computed the median percentage gain by (stimulus-tuned - stimulus-tuned scrambled) over (baseline - baseline scrambled) in language regions across participants. Here we display the average percentage gain across models, as well as the standard deviation. We include only voxels with estimated noise ceiling values >0.05. Results for the single models are reported in Appendix Figures \ref{fig:roi_final_gpt2_small}, \ref{fig:roi_final_gpt2_distill}, \ref{fig:roi_final_gpt2_medium}.
...and 20 more figures
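The contrasts described in the captions of Figures 4 and 5 can be sketched numerically as follows. This is a minimal illustration, not the authors' code: the captions do not give the exact formula, so the percentage gain is assumed here to be 100 * (a - b) / b per voxel, the helper name `median_percentage_gain` is hypothetical, the toy numbers are invented, and how voxels are pooled across participants is left open. Only the noise ceiling threshold of 0.05 comes from the captions.

```python
import numpy as np


def median_percentage_gain(a, b, noise_ceiling, threshold=0.05):
    """Median over voxels of 100 * (a - b) / b (assumed formula).

    Keeps only voxels whose estimated noise ceiling exceeds `threshold`
    and whose reference value `b` is non-zero.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    noise_ceiling = np.asarray(noise_ceiling, dtype=float)
    keep = (noise_ceiling > threshold) & (b != 0)
    gain = 100.0 * (a[keep] - b[keep]) / b[keep]
    return float(np.median(gain))


# Toy per-voxel alignment scores (invented for illustration only).
baseline = np.array([0.10, 0.20, 0.30, 0.05])
baseline_scrambled = np.array([0.05, 0.10, 0.20, 0.04])
tuned = np.array([0.12, 0.25, 0.33, 0.09])
tuned_scrambled = np.array([0.06, 0.10, 0.21, 0.05])
noise_ceiling = np.array([0.20, 0.30, 0.40, 0.01])  # last voxel is excluded

# Figure 4 style contrast: stimulus-tuned over baseline.
gain_tuning = median_percentage_gain(tuned, baseline, noise_ceiling)

# Figure 5 style contrast: the scrambling effect in the stimulus-tuned model
# over the scrambling effect in the baseline model.
gain_scrambling = median_percentage_gain(
    tuned - tuned_scrambled, baseline - baseline_scrambled, noise_ceiling
)

print(gain_tuning, gain_scrambling)
```

Note that a negative reference value would flip the sign of the gain under this assumed formula, so a real analysis would need an explicit rule for such voxels.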
