How Language Models Prioritize Contextual Grammatical Cues?

Hamidreza Amirzadeh; Afra Alishahi; Hosein Mohebbi

TL;DR

This paper investigates how language models handle gender agreement when multiple gender cue words are present, each capable of independently disambiguating a target gender pronoun. It reveals striking differences in how encoder-based and decoder-based models prioritize and use contextual information for their predictions.

Abstract

Transformer-based language models have shown an excellent ability to capture and use contextual information. Although various analysis techniques have been used to quantify and trace the contribution of single contextual cues to a target task such as subject-verb agreement or coreference resolution, scenarios in which multiple relevant cues are available in the context remain underexplored. In this paper, we investigate how language models handle gender agreement when multiple gender cue words are present, each capable of independently disambiguating a target gender pronoun. We analyze two widely used Transformer-based models: BERT, an encoder-based model, and GPT-2, a decoder-based model. Our analysis employs two complementary approaches: context mixing analysis, which tracks information flow within the model, and a variant of activation patching, which measures the impact of cues on the model's prediction. We find that BERT tends to prioritize the first cue in the context when forming both the target word representations and the model's prediction, while GPT-2 relies more on the final cue. Our findings reveal striking differences in how encoder-based and decoder-based models prioritize and use contextual information for their predictions.
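
As a rough illustration of the context mixing analysis mentioned in the abstract (the figure captions refer to Value Zeroing scores), the sketch below zeroes one token's value vector in a toy single-head self-attention layer and measures how much every token's output representation changes. It is a simplified sketch under stated assumptions, not the authors' implementation: the random weights, the dimensions, the absence of a causal mask, and the omission of residual connections, layer normalization, and the feed-forward block are all assumptions made for brevity, and the function names are hypothetical.

```python
import numpy as np

def self_attention(x, w_q, w_k, w_v, zero_value_of=None):
    """Toy single-head self-attention; optionally zero one token's value vector."""
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    if zero_value_of is not None:
        v = v.copy()
        v[zero_value_of] = 0.0  # attention weights themselves are left unchanged
    scores = q @ k.T / np.sqrt(k.shape[-1])
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v

def cosine_distance(a, b):
    return 1.0 - (a @ b) / (np.linalg.norm(a) * np.linalg.norm(b))

def value_zeroing_scores(x, w_q, w_k, w_v):
    """scores[i, j]: change in token i's output when token j's value is zeroed."""
    n = x.shape[0]
    base = self_attention(x, w_q, w_k, w_v)
    scores = np.zeros((n, n))
    for j in range(n):
        altered = self_attention(x, w_q, w_k, w_v, zero_value_of=j)
        for i in range(n):
            scores[i, j] = cosine_distance(base[i], altered[i])
    # Normalize each row so the scores for one target token sum to 1.
    return scores / scores.sum(axis=-1, keepdims=True)

# Hypothetical toy input: 5 tokens with 8-dimensional random embeddings.
rng = np.random.default_rng(0)
n_tokens, d = 5, 8
x = rng.normal(size=(n_tokens, d))
w_q, w_k, w_v = (rng.normal(size=(d, d)) for _ in range(3))
scores = value_zeroing_scores(x, w_q, w_k, w_v)
```

Reading row `i` of `scores` gives a distribution over context tokens for target token `i`. In the setting described above, one would compare the mass assigned to each gender cue word in the row of the target pronoun.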

Paper Structure (22 sections, 27 figures, 4 tables)

Introduction
Related Work
Context mixing.
Mechanistic interpretability.
Experimental Setup
Data
Target models
Model input setup
Which cue does the model rely on to form a target representation?
Setup
Results
Which cue does the model rely on to predict a target word?
Setup
Results
Conclusion
...and 7 more sections

Figures (27)

Figure 1: Value Zeroing scores for the pre-trained (top row) and fine-tuned (bottom row) BERT across different numbers of cue words.
Figure 2: Value Zeroing scores for the pre-trained (top row) and fine-tuned (bottom row) GPT-2 across different numbers of cue words.
Figure 3: Value Zeroing scores for constructing target token representation in a test example for fine-tuned models. Cue words are highlighted in bold.
Figure 4: Value patching scores for the fine-tuned BERT (top row) and fine-tuned GPT-2 (bottom row) across different numbers of cue words.
Figure 5: Value patching scores for a test example for fine-tuned models. Cue words are highlighted in bold.
...and 22 more figures
