In-Context Example Ordering Guided by Label Distributions
Zhichao Xu, Daniel Cohen, Bei Wang, Vivek Srikumar
TL;DR
This work tackles the sensitivity of in-context learning (ICL) to the order of in-context examples by formulating the ordering task as an optimization problem. It introduces Probability Distribution Ordering (PDO), which uses two priors inspired by Learning from Label Proportions to select well-performing orderings based on the model's predicted probability distributions. PDO works with both Direct and PMI scoring and covers the FewShot, FewShotU, and FewShotUP settings. Across 13 text classification datasets and 9 autoregressive LLMs, PDO consistently improves accuracy and reduces calibration error, and it also enables effective task-level exemplar selection without labeled development data. The approach is lightweight and generalizes across models and scoring schemes, which makes it practical for deploying calibrated ICL in real-world tasks.
Abstract
By allowing models to predict without task-specific training, in-context learning (ICL) with pretrained LLMs has enormous potential in NLP. However, a number of problems persist in ICL. In particular, its performance is sensitive to the choice and order of in-context examples. Given the same set of in-context examples in different orderings, model performance may vary from near random to near state-of-the-art. In this work, we formulate in-context example ordering as an optimization problem. We examine three problem settings that differ in the assumptions they make about what is known about the task. Inspired by the idea of learning from label proportions, we propose two principles for in-context example ordering guided by the model's probability predictions. We apply our proposed principles to thirteen text classification datasets and nine different autoregressive LLMs with 700M to 13B parameters. We demonstrate that our approach outperforms the baselines by improving classification accuracy, reducing model miscalibration, and selecting better in-context examples.
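To make the optimization framing concrete, the following is a minimal sketch of how an ordering could be chosen from label distributions alone. It is not the authors' implementation: the abstract does not spell out the two principles, so the objective here (pick the ordering whose average predicted label distribution is closest, in KL divergence, to a reference prior) is an assumption, as are the names `select_ordering`, `predict_proba`, and `toy_predict_proba`. The toy model stands in for an LLM and simply gives later demonstrations more influence, which mimics order sensitivity.

```python
import itertools
import math
from typing import Callable, List, Sequence, Tuple

Example = Tuple[str, int]  # (text, label index); hypothetical representation


def kl_divergence(p: Sequence[float], q: Sequence[float], eps: float = 1e-12) -> float:
    """KL(p || q) with a small epsilon to avoid log(0)."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def mean_distribution(dists: Sequence[Sequence[float]]) -> List[float]:
    """Average a list of label distributions component-wise."""
    n = len(dists)
    return [sum(d[k] for d in dists) / n for k in range(len(dists[0]))]


def select_ordering(
    examples: Sequence[Example],
    probe_inputs: Sequence[str],
    predict_proba: Callable[[Sequence[Example], str], Sequence[float]],
    prior: Sequence[float],
) -> Tuple[Tuple[Example, ...], float]:
    """Exhaustively score every ordering and keep the one whose average
    predicted label distribution is closest to `prior` (assumed objective)."""
    best_order, best_score = None, math.inf
    for order in itertools.permutations(examples):
        dists = [predict_proba(order, x) for x in probe_inputs]
        score = kl_divergence(mean_distribution(dists), prior)
        if score < best_score:
            best_order, best_score = order, score
    return best_order, best_score


def toy_predict_proba(order: Sequence[Example], x: str) -> List[float]:
    """Stand-in for an LLM: a demonstration at position i contributes
    weight i + 1 to its label, so later demonstrations count more."""
    mass = [1.0, 1.0]  # smoothing so no label gets zero probability
    for position, (_, label) in enumerate(order):
        mass[label] += position + 1
    total = sum(mass)
    return [m / total for m in mass]


# Toy data: two demonstrations of label 0, one of label 1.
examples = [("a", 0), ("b", 0), ("c", 1)]
probe_inputs = ["unlabeled input 1", "unlabeled input 2"]
uniform_prior = [0.5, 0.5]

best_order, best_score = select_ordering(
    examples, probe_inputs, toy_predict_proba, uniform_prior
)
```

With a real model, `predict_proba` would wrap a call that builds the prompt from `order` and returns normalized label probabilities. Exhaustive enumeration is only feasible for a handful of demonstrations, since the number of orderings grows factorially; sampling a subset of permutations is a common workaround.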
