Submodular Context Partitioning and Compression for In-Context Learning
Shaoyi Zheng, Canyu Zhang, Tianyi Zhou, Shengjie Wang
TL;DR
Sub-CP is a submodular, block-aware context selection framework for in-context learning that controls block diversity along a spectrum from global coverage to local coherence. It uses the facility location function with four partitioning strategies, supports precomputation, and integrates in a plug-and-play manner with ICAE, DENSE, and CEPE, yielding consistent improvements across five datasets and multiple model scales. The approach balances information coverage against redundancy. Analyses show robust gains on hard tasks and point to possible improvements such as query-aware selection and explicit balance constraints. Overall, Sub-CP offers a principled, flexible way to structure exemplars for scalable, effective ICL in large language models.
Abstract
In-context learning (ICL) enables efficient few-shot learning in large language models (LLMs) without training, but it suffers from the quadratic input complexity of transformers, which limits the maximum number of exemplars. Various efficient ICL approaches partition the context into blocks that are processed separately (e.g., via ensembling, compression, or cross-attention), but they often ignore the information redundancy or under-representation that different partition strategies introduce, leading to suboptimal performance. To tackle this problem, we propose Sub-CP, a block-aware context selection framework that leverages submodular objectives to control block diversity. Sub-CP supports a flexible spectrum of selection strategies, allowing each block to range from globally diverse to locally coherent. This provides fine-grained control over semantic structure while enabling precomputation. Extensive experiments across diverse tasks and multiple datasets show that Sub-CP consistently improves performance across model scales.
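To make the idea of a submodular objective controlling block diversity more concrete, the following is a minimal sketch of greedy facility location selection, followed by one way to form globally diverse blocks. It is an illustration under stated assumptions, not the paper's implementation: the function names (`facility_location`, `greedy_select`, `partition_into_diverse_blocks`), the toy similarity matrix, and the block-filling rule are all hypothetical, and similarities are assumed to be non-negative. The four partitioning strategies of Sub-CP are not specified in this section, so only the globally diverse extreme is sketched.

```python
from typing import List, Optional, Sequence

Matrix = Sequence[Sequence[float]]


def facility_location(sim: Matrix, selected: Sequence[int]) -> float:
    """f(S) = sum over all items i of max_{j in S} sim[i][j]; f(empty) = 0."""
    if not selected:
        return 0.0
    return sum(max(row[j] for j in selected) for row in sim)


def greedy_select(sim: Matrix, k: int,
                  candidates: Optional[Sequence[int]] = None) -> List[int]:
    """Greedily pick up to k candidates that maximize facility location.

    Assumes non-negative similarities, so starting coverage at 0 makes each
    computed gain equal to f(S + j) - f(S).
    """
    n = len(sim)
    pool = list(range(n)) if candidates is None else list(candidates)
    selected: List[int] = []
    # best_cover[i] = similarity of item i to its closest selected exemplar.
    best_cover = [0.0] * n
    for _ in range(min(k, len(pool))):
        best_gain, best_j = -1.0, -1
        for j in pool:
            if j in selected:
                continue
            gain = sum(max(sim[i][j] - best_cover[i], 0.0) for i in range(n))
            if gain > best_gain:
                best_gain, best_j = gain, j
        selected.append(best_j)
        best_cover = [max(best_cover[i], sim[i][best_j]) for i in range(n)]
    return selected


def partition_into_diverse_blocks(sim: Matrix, num_blocks: int,
                                  block_size: int) -> List[List[int]]:
    """Hypothetical 'globally diverse' partition: fill each block by running
    greedy selection over the exemplars not yet assigned to any block."""
    remaining = list(range(len(sim)))
    blocks: List[List[int]] = []
    for _ in range(num_blocks):
        if not remaining:
            break
        block = greedy_select(sim, block_size, candidates=remaining)
        blocks.append(block)
        remaining = [i for i in remaining if i not in block]
    return blocks


# Toy similarity matrix (assumed data): items 0-1 and items 2-3 form two clusters.
toy_sim = [
    [1.0, 0.9, 0.1, 0.0],
    [0.9, 1.0, 0.0, 0.1],
    [0.1, 0.0, 1.0, 0.8],
    [0.0, 0.1, 0.8, 1.0],
]
toy_selection = greedy_select(toy_sim, 2)
toy_blocks = partition_into_diverse_blocks(toy_sim, num_blocks=2, block_size=2)
```

On the toy matrix, the greedy selection picks one exemplar from each cluster, and each block likewise mixes both clusters. A locally coherent partition would instead group similar exemplars into the same block. Because the selection depends only on the exemplar pool and not on the query, it can be computed once ahead of time, which is consistent with the precomputation property noted in the abstract.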
