Label Words are Anchors: An Information Flow Perspective for Understanding In-Context Learning

Lean Wang; Lei Li; Damai Dai; Deli Chen; Hao Zhou; Fandong Meng; Jie Zhou; Xu Sun

TL;DR

The paper investigates in-context learning (ICL) through an information-flow lens, arguing that label words act as anchors that gather information in shallow transformer layers and drive final predictions in deeper layers. It introduces saliency-based metrics and a formal two-part anchor hypothesis (H1/H2), validated with GPT2-XL and GPT-J across multiple classification tasks. Leveraging this anchor view, it proposes three practical enhancements: anchor re-weighting to boost accuracy, anchor-only context compression to accelerate inference, and an anchor-distance framework for diagnosing ICL errors, demonstrating improvements in efficiency and interpretability. Overall, the work provides a cohesive mechanistic account of ICL and actionable techniques to improve and diagnose ICL in large language models.

Abstract

In-context learning (ICL) emerges as a promising capability of large language models (LLMs) by providing them with demonstration examples to perform diverse tasks. However, the underlying mechanism of how LLMs learn from the provided context remains under-explored. In this paper, we investigate the working mechanism of ICL through an information flow lens. Our findings reveal that label words in the demonstration examples function as anchors: (1) semantic information aggregates into label word representations during the shallow computation layers' processing; (2) the consolidated information in label words serves as a reference for LLMs' final predictions. Based on these insights, we introduce an anchor re-weighting method to improve ICL performance, a demonstration compression technique to expedite inference, and an analysis framework for diagnosing ICL errors in GPT2-XL. The promising applications of our findings again validate the uncovered ICL working mechanism and pave the way for future studies.

Paper Structure (43 sections, 13 equations, 14 figures, 6 tables)

Introduction
Label Words are Anchors
Hypothesis Motivated by Saliency Scores
Experimental Settings
Results and Analysis
Proposed Hypothesis
Shallow Layers: Information Aggregation
Experimental Settings
Implementation Details
Metrics
Results and Analysis
Deep Layers: Information Extraction
Experiments
Results and Analysis
Discussion of Our Hypothesis
...and 28 more sections

Figures (14)

Figure 1: Visualization of the information flow in a GPT model performing ICL. The line depth reflects the significance of the information flow from the right word to the left. The flows involving label words are highlighted. Label words gather information from demonstrations in shallow layers, which is then extracted in deep layers for final prediction.
Figure 2: Illustration of our hypothesis. In shallow layers, label words gather information from demonstrations to form semantic representations for deeper processing, while deep layers extract and utilize this information from label words to formulate the final prediction.
Figure 3: Relative sizes of $S_{wp}$, $S_{pq}$, and $S_{ww}$ in different layers on SST-2 and AGNews. Results for other datasets can be found in Appendix \ref{sec:appdendix_trec_emoc}. Initially, $S_{wp}$ occupies a significant proportion, but it gradually decays over layers, while $S_{pq}$ becomes the dominant one.
Figure 4: The impact of isolating label words versus randomly isolating non-label words within the first or last 5 layers. Isolating label words within the first 5 layers exerts the most substantial impact, highlighting the importance of shallow-layer information aggregation via label words.
Figure 5: $\text{AUCROC}_l$ and $R_l$ of each layer in GPT models. The results are averaged over SST-2, TREC, AGNews, and EmoC. $\text{AUCROC}_l$ reaches 0.8 in deep layers, and $R_l$ increases mainly in the middle and later layers.
...and 9 more figures
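
The quantities $S_{wp}$, $S_{pq}$, and $S_{ww}$ in the Figure 3 caption are not defined in this summary. The sketch below shows one plausible way to compute them, assuming that a per-layer saliency matrix is already available and that `saliency[i][j]` scores the information flow from token `j` to a later token `i`. The grouping is an assumed reading of the subscripts: $S_{wp}$ averages flow from earlier text into label words, $S_{pq}$ averages flow from label words into the target (final) position, and $S_{ww}$ averages all remaining causal pairs. The function name `flow_metrics` and its parameters are hypothetical and are not taken from the paper's code.

```python
def flow_metrics(saliency, label_positions, target_position):
    """Average saliency over three groups of (receiver i, sender j) pairs.

    Assumptions (illustrative, not the authors' implementation):
      - saliency is a square nested list; saliency[i][j] scores flow j -> i
      - only causal pairs (j < i) are counted, as in a decoder-only model
      - target_position is not itself a label position
    """
    labels = set(label_positions)
    sums = {"S_wp": 0.0, "S_pq": 0.0, "S_ww": 0.0}
    counts = {"S_wp": 0, "S_pq": 0, "S_ww": 0}

    for i in range(len(saliency)):
        for j in range(i):  # j < i: sender precedes receiver
            if i == target_position and j in labels:
                group = "S_pq"   # label word -> target position
            elif i in labels:
                group = "S_wp"   # earlier token -> label word
            else:
                group = "S_ww"   # every other causal pair
            sums[group] += saliency[i][j]
            counts[group] += 1

    # Report 0.0 for an empty group instead of dividing by zero.
    return {g: (sums[g] / counts[g] if counts[g] else 0.0) for g in sums}


# Toy example: 6 tokens, label words at positions 2 and 4, target at position 5.
toy = [[float(6 * i + j) for j in range(6)] for i in range(6)]
metrics = flow_metrics(toy, label_positions=[2, 4], target_position=5)
print(metrics)
```

Running this per layer and normalizing the three values against each other would give curves comparable in spirit to those described for Figure 3, where $S_{wp}$ dominates in shallow layers and $S_{pq}$ in deep ones.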
