One size doesn't fit all: Predicting the Number of Examples for In-Context Learning
Manish Chandra, Debasis Ganguly, Iadh Ounis
TL;DR
This work tackles the limitation of fixed, one-size-fits-all context sizes in in-context learning by introducing Adaptive In-Context Learning (AICL), which predicts the number of demonstrations $k$ to use for each test instance. AICL constructs ground-truth targets by evaluating $k$-shot performance for $k \in \{0,\ldots,M\}$, encoding the outcomes as a Boolean vector $\mathcal{K}(\boldsymbol{x})$ that records which values of $k$ yield a correct prediction, and training a multi-label classifier to map instance features to this vector. At inference, the number of examples is chosen as $\kappa(\boldsymbol{x}) = \arg\max \theta(\boldsymbol{x})$, the value of $k$ with the highest predicted score. The method optionally augments the input features with the distribution of neighbor labels to further improve the predictor. Experiments on SST2, TREC, CoLA, and RTE with Llama-2 and Phi-2 models show that AICL, especially with neighbor-label information (AICL(E+N)), outperforms fixed-$k$ baselines by up to 17% and generalizes across datasets and model families. The results suggest that tailoring the context size to each instance can substantially improve LLM-based classification, reducing the need for expensive hyper-parameter tuning and enabling more robust few-shot inference. Overall, AICL provides a practical, data-driven framework for optimizing prompt context across diverse NLP tasks and model configurations.
Abstract
In-context learning (ICL) adds a small number of localized examples from a labelled training set to an LLM's prompt, with the aim of steering the generative process towards better downstream task performance. Existing ICL approaches use the same number of examples (a pre-configured hyper-parameter) for every data instance. Our work alleviates the limitations of this 'one-size-fits-all' approach by dynamically predicting, for each data instance, the number of examples to use in few-shot inference with LLMs. In particular, we employ a multi-label classifier whose parameters are fitted on a training set in which the label for each instance indicates whether a specific value of k (the number of most similar examples, from 0 up to a maximum value) leads to a correct k-shot downstream prediction. Our experiments on a number of text classification benchmarks show that the proposed adaptive approach (AICL) substantially outperforms standard ICL, by up to 17%.
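
The labelling scheme and the selection of k described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: `build_target`, `select_k`, and the `predict` callable (a stand-in for k-shot LLM inference) are hypothetical names introduced here, the multi-label classifier itself is omitted, and breaking ties in favour of the smallest k is an assumption that the text does not specify.

```python
from typing import Callable, List, Sequence

# Hypothetical stand-in for k-shot LLM inference: takes the test input and
# the demonstrations placed in the prompt, and returns a predicted label.
KShotPredictor = Callable[[str, Sequence[str]], str]


def build_target(x: str, y: str, neighbours: Sequence[str],
                 predict: KShotPredictor, max_k: int) -> List[int]:
    """Return a Boolean target vector of length max_k + 1 for instance x.

    Entry k is 1 if prompting with the k most similar examples yields the
    gold label y, and 0 otherwise. `neighbours` is assumed to be sorted by
    decreasing similarity to x.
    """
    return [int(predict(x, neighbours[:k]) == y) for k in range(max_k + 1)]


def select_k(scores: Sequence[float]) -> int:
    """Pick the k with the highest classifier score.

    Ties go to the smallest k (an assumption made for this sketch).
    """
    return max(range(len(scores)), key=lambda k: scores[k])


# Toy predictor: correct only when at least two demonstrations are supplied.
def toy_predict(x: str, demos: Sequence[str]) -> str:
    return "positive" if len(demos) >= 2 else "negative"


# Training-time target for one instance, with k ranging over 0..3.
target = build_target("great film", "positive", ["d1", "d2", "d3"],
                      toy_predict, max_k=3)

# Inference-time choice of k from made-up classifier scores.
chosen = select_k([0.1, 0.2, 0.7, 0.7])
```

In this toy setting the target marks k = 2 and k = 3 as the values that lead to a correct prediction, and the made-up scores select k = 2. In practice the scores would come from the trained multi-label classifier applied to the instance features.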
