Learning to Search Effective Example Sequences for In-Context Learning
Xiang Gao, Ankita Sinha, Kamalika Das
TL;DR
The paper addresses the sensitivity of in-context learning performance to the choice of example sequence by proposing the Beam Search-based Example Sequence Constructor (BESC). BESC jointly optimizes sequence length, composition, arrangement, and query dependence using a prefix-suffix dual-encoder scoring model trained with contrastive learning, and applies beam search at inference time to navigate the large sequence space efficiently, with complexity $O(L \cdot b \cdot c) + O(N) + O(L \cdot b \cdot c \log N)$. Empirical results across six datasets and four language models show that BESC consistently outperforms baselines, and ablations confirm the importance of dynamic examples, adaptive length, arrangement, and sequential modeling. The work also demonstrates the transferability of a pretrained BESC scorer and discusses limitations, including applicability to open-ended tasks and potential prompting interactions and biases.
Abstract
Large language models (LLMs) demonstrate impressive few-shot learning capabilities, but their performance varies widely based on the sequence of in-context examples. Key factors influencing this include the sequence's length, composition, and arrangement, as well as its relation to the specific query. Existing methods often tackle these factors in isolation, overlooking their interdependencies. Moreover, the extensive search space for selecting optimal sequences complicates the development of a holistic approach. In this work, we introduce Beam Search-based Example Sequence Constructor (BESC), a novel method for learning to construct optimal example sequences. BESC addresses all key factors involved in sequence selection by considering them jointly during inference, while incrementally building the sequence. This design enables the use of beam search to significantly reduce the complexity of the search space. Experiments across various datasets and language models show notable improvements in performance.
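The incremental construction described above can be illustrated with a minimal beam search sketch. This is not the authors' implementation: the names `beam_search_sequences`, `score_fn`, and `toy_score` are hypothetical, and `toy_score` is a simple word-overlap heuristic standing in for the learned scoring model. The sketch only shows how one ordered, query-aware scorer lets a single search cover length (the best sequence of any length is kept), composition (which examples are added), and arrangement (the order in which they are appended).

```python
from typing import Callable, Sequence, Tuple

Seq = Tuple[str, ...]


def beam_search_sequences(
    pool: Sequence[str],
    query: str,
    score_fn: Callable[[str, Seq], float],
    max_len: int = 4,
    beam_width: int = 3,
) -> Tuple[Seq, float]:
    """Return the best-scoring ordered example sequence of length 0..max_len.

    score_fn is an assumed interface: it maps (query, ordered sequence) to a
    scalar, higher being better. In the paper this role is played by a
    learned scorer; here any callable works.
    """
    # Start from the empty sequence so that "no examples" is a valid answer.
    best = ((), score_fn(query, ()))
    beam = [best]
    for _ in range(max_len):
        candidates = []
        for seq, _ in beam:
            for example in pool:
                if example in seq:  # do not reuse an example
                    continue
                new_seq = seq + (example,)  # appending keeps the order explicit
                candidates.append((new_seq, score_fn(query, new_seq)))
        if not candidates:
            break
        # Keep only the top beam_width partial sequences for the next step.
        candidates.sort(key=lambda item: item[1], reverse=True)
        beam = candidates[:beam_width]
        # Adaptive length: remember the best sequence seen at any length.
        if beam[0][1] > best[1]:
            best = beam[0]
    return best


def toy_score(query: str, seq: Seq) -> float:
    """Hypothetical stand-in scorer: word overlap with the query minus a length cost."""
    query_words = set(query.lower().split())
    overlap = sum(len(query_words & set(ex.lower().split())) for ex in seq)
    return overlap - 0.75 * len(seq)


if __name__ == "__main__":
    pool = [
        "great movie with a strong cast",
        "the plot was dull",
        "terrible service at the restaurant",
    ]
    sequence, score = beam_search_sequences(pool, "was the movie great", toy_score)
    print(sequence, score)
```

Each step scores at most `beam_width * len(pool)` extensions, so the search avoids enumerating every ordered subset of the pool. How the paper's own candidate generation and scoring map onto the terms of its stated complexity is not specified in this summary.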
