Understanding In-Context Learning from Repetitions

Jianhao Yan, Jin Xu, Chiyu Song, Chenming Wu, Yafu Li, Yue Zhang

TL;DR

The paper addresses the unclear mechanism behind in-context learning (ICL) in LLMs by analyzing surface repetitions in demonstrations. It introduces token co-occurrence reinforcement as a key mechanism, arguing that the model learns reinforced connections between tokens through contextual co-occurrences during likelihood maximization. Empirical evidence across multiple models (e.g., LLaMA, OPT) shows that token reinforcement can constrain the output space and enable pattern-following, but can also create spurious connections when demonstrations embed non-informative or biased patterns. The findings offer a surface-pattern-based lens on ICL, with practical implications for designing demonstrations that maximize beneficial reinforcement while limiting vulnerability to spurious patterns and selection bias.

Abstract

This paper explores the elusive mechanism underpinning in-context learning in Large Language Models (LLMs). Our work provides a novel perspective by examining in-context learning via the lens of surface repetitions. We quantitatively investigate the role of surface features in text generation, and empirically establish the existence of token co-occurrence reinforcement, a principle that strengthens the relationship between two tokens based on their contextual co-occurrences. By investigating the dual impacts of these features, our research illuminates the internal workings of in-context learning and expounds on the reasons for its failures. This paper provides an essential contribution to the understanding of in-context learning and its potential limitations, providing a fresh perspective on this exciting capability.

Paper Structure (31 sections, 4 equations, 17 figures, 4 tables)

Figures (17)

  • Figure 1: Correct and incorrect in-context learning predictions of LLaMA-65B. The task is to identify whether the given sentence expresses a positive sentiment. We show the token-reinforced connections drawn from the demonstrations. In both cases, the LLM learns connections from the demonstrations and makes decisions based on them. With ordinary in-context demonstrations, the model learns reliable connections, and ideally several of these connections together realize the function of sentiment analysis. With repetitive demonstrations, the model gets stuck on spurious connections and misses the key information 'decreased', leading to a wrong prediction.
  • Figure 2: Left: An example of the self-reinforcement effect. We choose a normal sentence ('Answer is A'), repeat it several times, and present the probability of the token 'A'. The model used is LLaMA-7B. Right: Sentence-level self-reinforcement over LLMs. We plot all sizes of OPT and LLaMA with colors from light to dark. All sizes of LLaMA and OPT models demonstrate strong sentence-level self-reinforcement effects.
  • Figure 3: Token co-occurrence reinforcement. Even if only one token repeats in context, the self-reinforcement loop is triggered. "..X..Y.." denotes that two tokens are kept unchanged. The mean and variance are computed over 1,000 randomly generated samples.
  • Figure 4: Successive and distant reinforcement. The self-reinforcement effect is strongest when the two tokens are successive, i.e., distance=0. Otherwise, the reinforcement is smaller and appears insensitive to the distance. "..X.Y.." denotes that the distance between the two tokens is 1.
  • Figure 5: The probability of the next occurrence after several occurrences have been observed in context.
  • ...and 12 more figures
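
The probes described in the captions of Figures 2 to 4 can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the function names, the whitespace-level "tokens", and the `next_token_prob` callable (a stand-in for whatever returns a model's probability of a target token given a context) are all assumptions introduced here, and the paper's exact sampling procedure may differ.

```python
import random
from typing import Callable, List, Sequence


def repeated_sentence_contexts(sentence: Sequence[str], max_repeats: int) -> List[List[str]]:
    """Sentence-level probe (cf. Figure 2): n full copies of `sentence`,
    followed by the sentence without its last token, for n = 0..max_repeats.
    The last token of `sentence` is the one whose probability is scored."""
    prefix = list(sentence[:-1])
    return [list(sentence) * n + prefix for n in range(max_repeats + 1)]


def cooccurrence_context(x: str, y: str, distance: int, n_repeats: int,
                         vocab: Sequence[str], filler_len: int,
                         rng: random.Random) -> List[str]:
    """Token-level probe (cf. Figures 3 and 4): random filler in which only
    X and Y are kept fixed, with `distance` random tokens between them.
    The context ends just before Y would appear again, so Y is the target."""
    pool = [t for t in vocab if t not in (x, y)]  # filler never contains X or Y

    def rand(k: int) -> List[str]:
        return [rng.choice(pool) for _ in range(k)]

    ctx: List[str] = []
    for _ in range(n_repeats):
        ctx += rand(filler_len) + [x] + rand(distance) + [y]
    ctx += rand(filler_len) + [x] + rand(distance)
    return ctx


def probe(contexts: Sequence[List[str]], target: str,
          next_token_prob: Callable[[List[str], str], float]) -> List[float]:
    """Score `target` after each context with a user-supplied scorer
    (hypothetical interface; plug in a real language model here)."""
    return [next_token_prob(ctx, target) for ctx in contexts]
```

To use it with a real model, `next_token_prob` would wrap the model's tokenizer and forward pass; plotting the returned probabilities against the number of repeats gives curves of the kind the captions describe.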