Scaling Laws for Many-Shot In-Context Learning with Self-Generated Annotations
Zhengyao Gu, Henry Peng Zou, Yankai Chen, Aiwei Liu, Weizhi Zhang, Philip S. Yu
TL;DR
This work investigates how in-context learning (ICL) scales when demonstrations carry self-generated annotations, proposing a three-step Semi-Supervised ICL framework: annotation generation, demonstration selection, and semi-supervised inference. It introduces Naive-SemiICL, a simple single-iteration baseline that outperforms ICL with ground-truth labels in zero-, few-, and many-shot regimes and reveals a scaling law in which optimal performance is reached with more than 1,000 demonstrations. Building on this, IterPSD iteratively refines pseudo-demonstrations via curriculum pseudo-labeling and confirmation-bias mitigation, delivering up to 6.8% additional gains on classification tasks. Across 16 datasets spanning classification, translation, and reasoning, the approach performs strongly under low-resource conditions and highlights the practical potential of pseudo-demonstrations for scalable, cost-efficient ICL.
Abstract
The high cost of obtaining high-quality annotated data for in-context learning (ICL) has motivated the development of methods that use self-generated annotations in place of ground-truth labels. While these approaches have shown promising results in few-shot settings, they generally do not scale to many-shot scenarios. In this work, we study ICL with self-generated examples using a framework analogous to traditional semi-supervised learning, consisting of annotation generation, demonstration selection, and in-context inference. Within this framework, we propose a simple baseline that outperforms ground-truth ICL in zero-shot, few-shot, and many-shot settings. Notably, we observe a scaling law with this baseline, where optimal performance is achieved with more than 1,000 demonstrations. To fully exploit the many-shot capabilities of semi-supervised ICL, we introduce IterPSD, an iterative annotation approach that integrates iterative refinement and curriculum pseudo-labeling techniques from semi-supervised learning, yielding up to 6.8% additional gains on classification tasks.
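The three-step framework described above (annotation generation, demonstration selection, in-context inference) can be illustrated with a minimal single-pass sketch. This is not the authors' implementation: the `annotate` callable, the prompt template, the confidence-threshold filter, and the `threshold` and `max_demos` parameters are all assumptions made for illustration, and `toy_annotate` is a hypothetical keyword-based stand-in for a real language model.

```python
from typing import Callable, List, Tuple

Demo = Tuple[str, str]  # (input text, label)
# Hypothetical model interface: prompt -> (predicted label, confidence in [0, 1]).
Annotator = Callable[[str], Tuple[str, float]]


def build_prompt(demos: List[Demo], query: str) -> str:
    """Format demonstrations followed by the query (assumed template)."""
    blocks = [f"Input: {x}\nLabel: {y}" for x, y in demos]
    blocks.append(f"Input: {query}\nLabel:")
    return "\n\n".join(blocks)


def semi_supervised_icl(
    annotate: Annotator,
    labeled: List[Demo],
    unlabeled: List[str],
    query: str,
    threshold: float = 0.8,  # assumed confidence cutoff
    max_demos: int = 1000,   # assumed cap on pseudo-demonstrations
) -> Tuple[str, List[Demo]]:
    # Step 1: annotation generation. Label each unlabeled input,
    # prompting with the small ground-truth set.
    scored = []
    for x in unlabeled:
        label, conf = annotate(build_prompt(labeled, x))
        scored.append((conf, x, label))

    # Step 2: demonstration selection. Keep confident pseudo-labels,
    # most confident first (an assumed selection rule).
    kept = sorted(
        (s for s in scored if s[0] >= threshold),
        key=lambda s: s[0],
        reverse=True,
    )
    pseudo = [(x, y) for _, x, y in kept[:max_demos]]

    # Step 3: in-context inference. Answer the query with both
    # ground-truth and pseudo-demonstrations in the prompt.
    prediction, _ = annotate(build_prompt(labeled + pseudo, query))
    return prediction, pseudo


def toy_annotate(prompt: str) -> Tuple[str, float]:
    """Hypothetical stand-in for an LLM: keyword rules on the final input."""
    query = prompt.rsplit("Input: ", 1)[1].split("\nLabel:")[0]
    if "good" in query:
        return "positive", 0.9
    if "bad" in query:
        return "negative", 0.9
    return "positive", 0.3  # low confidence, filtered out in step 2


prediction, pseudo = semi_supervised_icl(
    toy_annotate,
    labeled=[("good movie", "positive")],
    unlabeled=["bad plot", "meh", "good cast"],
    query="bad acting",
)
```

In this toy run the low-confidence input "meh" is dropped during selection, so only two pseudo-demonstrations reach the final prompt. An iterative variant in the spirit of IterPSD would repeat steps 1 and 2 over several rounds, feeding previously accepted pseudo-demonstrations back into the annotation prompt; the details of that procedure are not specified in this section.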
