Understanding In-Context Learning from Repetitions
Jianhao Yan, Jin Xu, Chiyu Song, Chenming Wu, Yafu Li, Yue Zhang
TL;DR
The paper addresses the unclear mechanism behind in-context learning (ICL) in LLMs by analyzing surface repetitions in demonstrations. It identifies token co-occurrence reinforcement as a key mechanism, arguing that the model learns reinforced connections between tokens through contextual co-occurrences during likelihood maximization. Empirical evidence across multiple models (e.g., LLaMA, OPT) shows that token reinforcement can both constrain the output space and enable pattern-following, but can also create spurious connections when demonstrations embed non-informative or biased patterns. The findings offer a surface-pattern-based lens on ICL, with practical implications for designing demonstrations that maximize beneficial reinforcement while minimizing vulnerability to spurious patterns and selection bias.
Abstract
This paper explores the elusive mechanism underpinning in-context learning in Large Language Models (LLMs). We offer a novel perspective by examining in-context learning through the lens of surface repetitions. We quantitatively investigate the role of surface features in text generation and empirically establish the existence of token co-occurrence reinforcement, a principle that strengthens the relationship between two tokens based on their contextual co-occurrences. By investigating the dual impacts of these features, we illuminate the internal workings of in-context learning and explain the reasons for its failures. The paper thereby contributes to the understanding of in-context learning and its potential limitations.
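
As a rough intuition for the reinforcement principle described above, the following toy sketch estimates the probability that one token follows another from adjacent-pair counts in a context, and shows that the estimate grows as the pair is repeated. This is only an illustrative count-based proxy, not the paper's method and not how an LLM computes probabilities. The function names, the add-alpha smoothing, and the vocabulary size of 100 are all assumptions made for the example.

```python
from collections import Counter


def cooccurrence_prob(context, first, second, vocab_size, alpha=1.0):
    """Add-alpha smoothed estimate of P(second | first) from adjacent pairs.

    Hypothetical helper: a count-based stand-in for a model's next-token
    probability, used only to illustrate the idea of reinforcement.
    """
    pair_counts = Counter(zip(context, context[1:]))
    # Only tokens that have a successor can act as the first token of a pair.
    first_counts = Counter(context[:-1])
    numerator = pair_counts[(first, second)] + alpha
    denominator = first_counts[first] + alpha * vocab_size
    return numerator / denominator


def reinforcement_curve(first, second, max_repeats, vocab_size=100):
    """Estimate P(second | first) after the pair appears 0..max_repeats times."""
    return [
        cooccurrence_prob([first, second] * n, first, second, vocab_size)
        for n in range(max_repeats + 1)
    ]


# Each extra repetition of the pair raises the estimated probability.
curve = reinforcement_curve("Answer", ":", max_repeats=5)
for repeats, prob in enumerate(curve):
    print(f"{repeats} repetitions: P(':' | 'Answer') = {prob:.4f}")
```

In this toy setting the estimate rises monotonically with the number of repetitions regardless of whether the pair is meaningful, which mirrors the dual effect discussed in the abstract: the same reinforcement that supports pattern-following can also strengthen spurious connections.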
