Understanding and Controlling Repetition Neurons and Induction Heads in In-Context Learning
Nhi Hoai Doan, Tatsuya Hiraoka, Kentaro Inui
TL;DR
The paper investigates how repetition neurons and induction heads contribute to in-context learning (ICL) and repetitive generation in large language models. Using layer-wise ablations in Llama-3.1-8B (and other models), it identifies repetition neurons via activation differences $\Delta_n = \bar{a}_n - a_n$ and induction heads via prefix-matching scores. The results point to a cascade in which induction heads detect patterns and late-layer repetition neurons execute the resulting predictions. Three findings stand out: late-layer repetition neurons are critical executors of pattern recall; a small subset of induction heads serves as the primary detectors; and jointly ablating both components nearly collapses pattern recognition. By contrast, a three-segment middle-layer ablation reduces repetition with minimal disruption to ICL. The work introduces a practical, architecture-aware intervention strategy and a generalizable three-step pipeline (pattern detection, neuron/head ranking, targeted ablation) that could extend to controlling memorization, hallucination, or other emergent behaviors in LLMs.
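The ranking and ablation steps of the pipeline can be illustrated with a minimal sketch. It assumes that $\bar{a}_n$ is the mean activation of neuron $n$ over a repeating span and $a_n$ is its mean activation over a non-repeating span; the summary does not spell out these definitions, so this reading is an assumption. The function names `rank_repetition_neurons` and `ablate` are hypothetical, and activations are plain lists rather than real model tensors.

```python
from statistics import fmean


def rank_repetition_neurons(rep_acts, base_acts, top_k):
    """Rank neurons by activation difference (assumed reading of Delta_n).

    rep_acts, base_acts: per-token activation vectors (list of lists),
    one float per neuron, for a repeating and a non-repeating span.
    Returns the top_k neuron indices and the full list of differences.
    """
    n_neurons = len(rep_acts[0])
    deltas = []
    for n in range(n_neurons):
        mean_rep = fmean(tok[n] for tok in rep_acts)    # assumed \bar{a}_n
        mean_base = fmean(tok[n] for tok in base_acts)  # assumed a_n
        deltas.append(mean_rep - mean_base)
    # Largest difference first; ties keep their original neuron order.
    ranked = sorted(range(n_neurons), key=lambda n: deltas[n], reverse=True)
    return ranked[:top_k], deltas


def ablate(acts, neuron_ids):
    """Zero-ablation: copy the activations with the chosen neurons set to 0."""
    dead = set(neuron_ids)
    return [[0.0 if n in dead else a for n, a in enumerate(tok)] for tok in acts]


# Toy data: 2 tokens per span, 3 neurons; only neuron 2 reacts to repetition.
rep = [[1.0, 0.0, 3.0], [1.0, 2.0, 5.0]]
base = [[1.0, 1.0, 0.0], [1.0, 1.0, 2.0]]
top, deltas = rank_repetition_neurons(rep, base, top_k=1)
ablated = ablate(rep, top)
```

In a real model the same idea would typically be applied per layer, with the ablation implemented as a forward hook on the MLP activations; zeroing is only one possible ablation choice, and the paper's exact procedure may differ.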
Abstract
This paper investigates the relationship between large language models' (LLMs) ability to recognize repetitive input patterns and their performance on in-context learning (ICL). In contrast to prior work that has primarily focused on attention heads, we examine this relationship from the perspective of skill neurons, specifically repetition neurons. Our experiments reveal that the impact of these neurons on ICL performance varies depending on the depth of the layer in which they reside. By comparing the effects of repetition neurons and induction heads, we further identify strategies for reducing repetitive outputs while maintaining strong ICL capabilities.
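For the induction-head side of the comparison, a prefix-matching score can be sketched as follows. This uses the commonly cited definition: on a sequence with repeated tokens, measure how much attention a position pays to the token that followed an earlier occurrence of the current token. Whether the paper uses exactly this variant is an assumption, and `prefix_matching_score` is a hypothetical helper operating on a plain attention matrix for one head.

```python
def prefix_matching_score(tokens, attn):
    """Average attention paid to 'the token after a previous occurrence'.

    tokens: list of token ids.
    attn:   attn[i][j] is the attention weight from query position i
            to key position j for a single head (causal, so j <= i).
    """
    scores = []
    for i, tok in enumerate(tokens):
        # Positions right after each earlier occurrence of the current token.
        targets = [j + 1 for j in range(i) if tokens[j] == tok]
        if targets:
            scores.append(sum(attn[i][t] for t in targets))
    return sum(scores) / len(scores) if scores else 0.0


# Toy check on a sequence repeated twice: [5, 6, 7, 5, 6, 7].
tokens = [5, 6, 7, 5, 6, 7]
size = len(tokens)

# An idealised induction head: in the second half, position i attends fully
# to position i - 2, the token that followed the earlier copy of tokens[i].
induction = [[0.0] * size for _ in range(size)]
for i in range(size):
    induction[i][i - 2 if i >= 3 else 0] = 1.0

# A head that only attends to the current position.
self_attn = [[1.0 if i == j else 0.0 for j in range(size)] for i in range(size)]

induction_score = prefix_matching_score(tokens, induction)
self_score = prefix_matching_score(tokens, self_attn)
```

Heads with the highest scores under such a measure would be the candidates for the head-ablation experiments, alongside the neuron ranking.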
