Large Language Models Know What Makes Exemplary Contexts
Quanyu Long, Jianda Chen, Wenya Wang, Sinno Jialin Pan
TL;DR
This work tackles in-context learning by enabling LLMs to self-select and order demonstrations through a sequential retrieval process guided by a reward model trained on the LLM’s preferences. A parameter-efficient retrieval head, initialized from LLM embeddings, selects $k$ demonstrations autoregressively, while a reward head trained on pairwise preferences (Bradley–Terry) provides stable feedback for PPO-based reinforcement learning updates of the retrieval head. The approach improves ICL performance across 11 tasks, retrieves demonstrations that are more representative and more diverse, and learns a retrieval policy that transfers across LLMs. Because the LLM stays frozen and only the retrieval and reward heads are updated, the method offers a scalable, self-consistent way to optimize context for diverse tasks, with potential applicability to broader retrieval-augmented AI systems.
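As a minimal illustration of the pairwise (Bradley–Terry) objective mentioned above, the sketch below computes the preference probability and the corresponding negative log-likelihood for a single pair of scalar rewards. The function names and the use of plain floats are assumptions made for illustration; this is not the paper's reward head, which operates on LLM representations.

```python
import math


def bradley_terry_prob(r_preferred: float, r_rejected: float) -> float:
    """Probability that the preferred context beats the rejected one.

    Under the Bradley-Terry model this is sigmoid(r_preferred - r_rejected).
    Hypothetical helper: the rewards here are plain floats.
    """
    diff = r_preferred - r_rejected
    # Branch on the sign so math.exp never receives a large positive argument.
    if diff >= 0:
        return 1.0 / (1.0 + math.exp(-diff))
    e = math.exp(diff)
    return e / (1.0 + e)


def pairwise_loss(r_preferred: float, r_rejected: float) -> float:
    """Negative log-likelihood of one preference pair.

    Equals -log(sigmoid(diff)) = log(1 + exp(-diff)), written in a
    numerically stable softplus form.
    """
    diff = r_preferred - r_rejected
    return max(-diff, 0.0) + math.log1p(math.exp(-abs(diff)))
```

The loss shrinks as the reward assigned to the preferred context grows relative to the rejected one, which is the signal a reward head would be trained on under this assumption.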
Abstract
In-context learning (ICL) has proven to be a significant capability with the advancement of large language models (LLMs). By instructing LLMs with few-shot demonstrative examples, ICL enables them to perform a wide range of tasks without updating millions of parameters. This paper presents a unified framework that allows LLMs to self-select influential in-context examples to compose their contexts, self-rank candidates with different demonstration compositions, and self-optimize demonstration selection and ordering through reinforcement learning. Specifically, our method designs a parameter-efficient retrieval head that generates optimized demonstrations after training with rewards derived from the LLM's own preferences. Experimental results validate the proposed method's effectiveness in enhancing ICL performance. Additionally, our approach identifies and selects the most representative examples for the current task and retrieves more diverse demonstrations.
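The idea of selecting and ordering demonstrations one at a time can be sketched roughly as follows. This is a toy greedy version under stated assumptions: the dot-product scoring, the averaging state update, and the name `select_demonstrations` are all hypothetical stand-ins, not the trained retrieval head or the PPO procedure described in the paper.

```python
from typing import List, Sequence


def select_demonstrations(
    query_vec: Sequence[float],
    candidate_vecs: Sequence[Sequence[float]],
    k: int,
) -> List[int]:
    """Greedily pick up to k candidate indices, one per step, in order.

    Assumption: candidates are scored by a dot product against a running
    state vector, and the state is updated after each pick so that later
    choices depend on earlier ones (the autoregressive aspect).
    """
    state = list(query_vec)
    chosen: List[int] = []
    for _ in range(min(k, len(candidate_vecs))):
        best_idx, best_score = -1, float("-inf")
        for idx, vec in enumerate(candidate_vecs):
            if idx in chosen:
                continue  # select without replacement
            score = sum(s * v for s, v in zip(state, vec))
            if score > best_score:
                best_idx, best_score = idx, score
        chosen.append(best_idx)
        # Hypothetical state update: average the state with the picked vector.
        state = [(s + v) / 2.0 for s, v in zip(state, candidate_vecs[best_idx])]
    return chosen


# Example usage with toy 2-D embeddings.
order = select_demonstrations([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1]], k=2)
print(order)  # [0, 2]
```

The returned list fixes both which demonstrations are used and the order in which they would appear in the prompt, which is the quantity a learned policy would optimize.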
