Understanding and Controlling Repetition Neurons and Induction Heads in In-Context Learning

Nhi Hoai Doan, Tatsuya Hiraoka, Kentaro Inui

TL;DR

The paper investigates how repetition neurons and induction heads contribute to in-context learning (ICL) and repetitive generation in large language models. Using layer-wise ablations in Llama-3.1-8B (and other models), it identifies repetition neurons via activation differences $\Delta_n = \bar{a}_n - a_n$ and induction heads via prefix-matching scores, revealing a cascade in which induction heads detect patterns and late-layer repetition neurons execute predictions. Key findings: late-layer repetition neurons are critical executors for Pattern recall; a small subset of induction heads serves as the primary detectors; and joint ablation of both components nearly collapses pattern recognition. A three-segment middle-layer ablation, by contrast, reduces repetition with minimal ICL disruption. The work introduces a practical, architecture-aware intervention strategy and a generalizable three-step pipeline (pattern detection, neuron/head ranking, targeted ablation) that could extend to controlling memorization, hallucination, or other emergent behaviors in LLMs.
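The neuron-ranking and ablation steps of this pipeline can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes $\Delta_n$ is a neuron's mean activation on repetitive text minus its mean activation on non-repetitive text, and it models ablation as zeroing the selected activations. The function names (`rank_repetition_neurons`, `ablate`) and the synthetic activations are hypothetical.

```python
import numpy as np

def rank_repetition_neurons(act_rep, act_norm, top_k):
    """Rank neurons by an assumed activation difference.

    act_rep, act_norm: arrays of shape (num_tokens, num_neurons) holding
    activations on repetitive and non-repetitive text, respectively.
    Returns the indices of the top_k neurons and the per-neuron difference.
    """
    delta = act_rep.mean(axis=0) - act_norm.mean(axis=0)
    order = np.argsort(-delta)  # largest difference first
    return order[:top_k], delta

def ablate(acts, neuron_ids):
    """Return a copy of acts with the chosen neurons set to zero (assumed ablation)."""
    out = acts.copy()
    out[:, neuron_ids] = 0.0
    return out

# Synthetic data: 16 neurons, two of which fire strongly on repetitive text.
rng = np.random.default_rng(0)
act_norm = rng.normal(0.0, 1.0, size=(200, 16))
act_rep = rng.normal(0.0, 1.0, size=(200, 16))
act_rep[:, [3, 7]] += 5.0

top, delta = rank_repetition_neurons(act_rep, act_norm, top_k=2)
ablated = ablate(act_rep, top)
print(sorted(top.tolist()))
```

In a real model the activations would be collected from the feed-forward layers with forward hooks, and the ranking could be restricted to a layer segment before ablating, as in the layer-wise experiments the summary describes.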

Abstract

This paper investigates the relationship between large language models' (LLMs) ability to recognize repetitive input patterns and their performance on in-context learning (ICL). In contrast to prior work that has primarily focused on attention heads, we examine this relationship from the perspective of skill neurons, specifically repetition neurons. Our experiments reveal that the impact of these neurons on ICL performance varies depending on the depth of the layer in which they reside. By comparing the effects of repetition neurons and induction heads, we further identify strategies for reducing repetitive outputs while maintaining strong ICL capabilities.
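The induction heads that the repetition neurons are compared against are selected by prefix-matching score. The sketch below shows one common way to compute such a score for a single head; it is an assumption about the general technique rather than the paper's exact procedure. For each token that has appeared earlier in the sequence, it sums the attention placed on the positions immediately following those earlier occurrences, then averages over such tokens. The function name and the toy attention matrices are hypothetical.

```python
import numpy as np

def prefix_matching_score(attn, tokens):
    """Average attention from each repeated token to the positions right
    after its earlier occurrences.

    attn: (seq_len, seq_len) causal attention matrix of one head,
          where attn[i, j] is the weight position i places on position j.
    tokens: sequence of token ids of length seq_len.
    """
    tokens = np.asarray(tokens)
    scores = []
    for i in range(len(tokens)):
        prev = np.flatnonzero(tokens[:i] == tokens[i])  # earlier occurrences
        if prev.size:
            targets = prev + 1  # tokens that followed them (always <= i)
            scores.append(attn[i, targets].sum())
    return float(np.mean(scores)) if scores else 0.0

tokens = [5, 9, 2, 5, 9, 2]  # a short sequence repeated once
n = len(tokens)

# Idealized induction head: every repeated token attends only to the
# token that followed its previous occurrence.
ideal = np.eye(n)
for i in (3, 4, 5):
    ideal[i] = 0.0
    ideal[i, i - 2] = 1.0

# Baseline head: uniform causal attention over all visible positions.
uniform = np.tril(np.ones((n, n)))
uniform /= uniform.sum(axis=1, keepdims=True)

ideal_score = prefix_matching_score(ideal, tokens)
uniform_score = prefix_matching_score(uniform, tokens)
print(ideal_score, uniform_score)
```

Heads would then be ranked by this score, averaged over many repeated sequences, and the top fraction treated as induction heads.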

Paper Structure

This paper contains 27 sections, 1 equation, 27 figures, 12 tables.

Figures (27)

  • Figure 1: Repetition neurons (orange nodes), known for their strong activation in repetitive text generation, play a causal role in few‐shot ICL. Layer-wise ablation shows that deactivating: initial layers → negligible effect on ICL or generation; middle layers → small drop in ICL recall and reduced repetition; last layers → severe ICL failure but degrades repetitive generation.
  • Figure 2: Distribution of the top 31 (3%) induction heads and top 1000 (0.2%) repetition neurons across layers.
  • Figure 3: Model performance after layer-wise ablation of repetition neurons (solid lines) versus randomly selected neurons (dashed line) in the 10-shot ICL setting. Performance is reported separately for the Pattern (top) and Non-Pattern (bottom) subsets. Data points at 0 on the X-axis show performance without ablation.
  • Figure 4: Llama-3.1-8B performance on the Pattern (top) and Non-Pattern (bottom) subsets after ablating the induction heads with the highest prefix-matching scores on two classes in the 10-shot setting. Data points at 0 on the X-axis show performance without ablation.
  • Figure 5: $\Delta$Recall for Pattern (blue) and Non-Pattern (orange) inputs on five tasks under four joint‐ablation regimes in the 10-shot setting: (a) repetition neurons (top 250 from the final segment [0.8–1.0]) $\times$ induction heads (top 3% by prefix-matching score), (b) repetition neurons $\times$ random heads, (c) random neurons $\times$ induction heads, and (d) random neurons $\times$ random heads. Negative values indicate a performance drop after ablation.
  • ...and 22 more figures