Repetitions are not all alike: distinct mechanisms sustain repetition in language models
Matéo Mahaut, Francesca Franzon
TL;DR
The paper investigates why language models repeatedly generate identical sequences, disentangling the underlying mechanisms and tracing how they develop during training. It contrasts natural repetition with repetition induced by in-context learning (ICL), using developmental trajectories, attention-head activations, and confidence (entropy) analyses to reveal distinct circuitry and dynamics. The findings show that ICL-induced repetition relies on a dedicated, progressively specialized attention-head circuit (with late MLP involvement), whereas natural repetition arises early, lacks a defined circuit, and often attends to low-information tokens, suggesting a fallback strategy. These results indicate that superficially similar repetition behaviors in LLMs originate from qualitatively different internal processes, with implications for mitigation and robust generation across tasks.
Abstract
Large Language Models (LLMs) can sometimes degrade into repetitive loops, persistently generating identical word sequences. Because repetition is rare in natural human language, its frequent occurrence across diverse tasks and contexts in LLMs remains puzzling. Here we investigate whether behaviorally similar repetition patterns arise from distinct underlying mechanisms and how these mechanisms develop during model training. We contrast two conditions: repetitions elicited by natural text prompts and repetitions induced by in-context learning (ICL) setups that explicitly require copying behavior. Our analyses reveal that ICL-induced repetition relies on a dedicated network of attention heads that progressively specialize over training, whereas naturally occurring repetition emerges early and lacks a defined circuitry. Attention inspection further shows that natural repetition focuses disproportionately on low-information tokens, suggesting a fallback behavior when relevant context cannot be retrieved. These results indicate that superficially similar repetition behaviors originate from qualitatively different internal processes, reflecting distinct modes of failure and adaptation in language models.
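
To make the notion of a "repetitive loop" concrete, the following is a minimal sketch of one way to flag a generation whose tail consists of the same block of tokens repeated several times. It is an illustration only, not the detection criterion used in the paper: the function name `trailing_repetition` and the thresholds `max_period` and `min_repeats` are assumptions introduced here.

```python
def trailing_repetition(tokens, max_period=20, min_repeats=3):
    """Return (period, repeats) for the shortest block that repeats
    back-to-back at the end of `tokens` at least `min_repeats` times,
    or None if no such block exists.

    Hypothetical helper for illustration; thresholds are arbitrary.
    """
    n = len(tokens)
    for period in range(1, max_period + 1):
        # Not enough tokens to hold `min_repeats` copies of this period.
        if period * min_repeats > n:
            break
        block = tokens[n - period:]
        repeats = 1
        start = n - 2 * period
        # Walk backwards one block at a time while the block keeps matching.
        while start >= 0 and tokens[start:start + period] == block:
            repeats += 1
            start -= period
        if repeats >= min_repeats:
            return period, repeats
    return None


looping = "the cat sat on the mat the mat the mat the mat".split()
print(trailing_repetition(looping))          # (2, 4)
print(trailing_repetition("a b c d".split()))  # None
```

A check like this only captures the surface behavior. As the abstract argues, two generations that both trigger it may be driven by different internal mechanisms, which is why the paper looks at attention heads and training dynamics rather than at output alone.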
