Not the Example, but the Process: How Self-Generated Examples Enhance LLM Reasoning
Daehoon Gwak, Minseo Jung, Junwoo Park, Minho Park, ChaeHun Park, Junha Hyung, Jaegul Choo
TL;DR
This paper investigates why self-generated examples improve LLM reasoning, hypothesizing that the gains come from the problem-creation process rather than from the resulting examples. It systematically compares Zero-shot, Integrated, and Decoupled prompting across five architectures on MATH and GSM8K, and uses attention analysis to probe the internal mechanisms. The main finding is that Integrated prompting, which couples problem generation and solving in one prompt, performs best, while Decoupled prompting offers only marginal gains over Zero-shot. The results provide design guidance for prompting strategies in complex reasoning tasks and suggest that fostering the problem-creation process can reduce reliance on manual exemplars.
Abstract
Recent studies have shown that Large Language Models (LLMs) can improve their reasoning performance through self-generated few-shot examples, achieving results comparable to manually curated in-context examples. However, the mechanism behind these gains remains unclear, making it hard to decide when and how to apply the technique effectively. In this work, we argue that the key benefit arises not from the generated examples themselves but from the act of creating them. To validate this, we systematically evaluate three prompting strategies for in-context learning on reasoning-intensive tasks across diverse LLM architectures: (1) Zero-shot prompting; (2) Integrated prompting, where LLMs create and solve problems within a single, unified prompt; and (3) Decoupled prompting, where self-generated examples are reused as in-context examples, but the context of their creation is excluded. Experiments across five widely used model architectures show that Integrated prompting consistently outperforms both Zero-shot and Decoupled prompting, whereas Decoupled prompting offers only marginal gains over Zero-shot. We further conduct an attention analysis and observe significant differences in attention patterns between Integrated and Decoupled prompting. These findings suggest that the advantage of self-generation prompting comes from the process of problem creation, not the examples themselves, providing valuable insights for designing more effective prompting strategies.
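To make the distinction between the three strategies concrete, the following is a minimal sketch of how each prompt could be assembled. The prompt wording, the function names, and the `LLM` callable are illustrative assumptions, not the templates or interface used in the paper.

```python
from typing import Callable

# Hypothetical LLM interface (an assumption for this sketch):
# takes a prompt string and returns the completion text.
LLM = Callable[[str], str]


def zero_shot_prompt(question: str) -> str:
    """Zero-shot: the model sees only the target problem."""
    return (
        "Solve the following problem step by step.\n\n"
        f"Problem: {question}\nSolution:"
    )


def integrated_prompt(question: str, n_examples: int = 2) -> str:
    """Integrated: problem creation and solving happen in one prompt,
    so the creation process stays in the model's context."""
    return (
        f"First, write {n_examples} problems related to the one below "
        "and solve each of them.\n"
        "Then solve the original problem step by step.\n\n"
        f"Problem: {question}\n"
    )


def decoupled_prompt(question: str, llm: LLM, n_examples: int = 2) -> str:
    """Decoupled: examples are generated in a separate call, then reused
    as plain few-shot context without the creation instruction."""
    generation_prompt = (
        f"Write {n_examples} problems related to the one below "
        "and solve each of them.\n\n"
        f"Problem: {question}\n"
    )
    examples = llm(generation_prompt).strip()
    # Only the generated examples are carried over; the instruction that
    # produced them is deliberately dropped.
    return f"{examples}\n\nProblem: {question}\nSolution:"
```

Under these assumptions, the Integrated and Decoupled prompts can expose the model to similar examples; what differs is whether the instruction and act of creating them remain in the context when the target problem is solved.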
