Naive Bayes-based Context Extension for Large Language Models
Jianlin Su, Murtadha Ahmed, Wenbo, Luo Ao, Mingren Zhu, Yunfeng Liu
TL;DR
This work tackles the context-length limitation of in-context learning (ICL) in large language models by introducing Naive Bayes-based Context Extension (NBCE), which partitions demonstrations into multiple equal-length windows, selects a posterior window via voting, and applies Bayes' theorem to generate test outputs without any fine-tuning. The approach is a practical, linear-cost way to broaden contextual supervision: it approximates $p(T|S_1,...,S_n) \propto p(T)\prod_{k=1}^n p(S_k|T)$ and refines the result with pooling and a beta parameter, so it can be applied across models and tasks. Experimental results on text classification and multi-choice benchmarks show that NBCE often outperforms the parallel context window (PCW) baseline, with especially strong gains as the number of classes rises and with larger models, while remaining stable. The work also provides ablations and analyses of pooling strategies and hyperparameters, and releases code to facilitate adoption and further study of context extension in ICL.
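The approximation above can be unpacked into a rule that combines per-window predictions. The derivation below is a sketch that assumes only Bayes' theorem and conditional independence of the windows $S_k$ given $T$. The averaged term $\overline{\log p(T|S)}$ and the identification $\beta = n-1$ are one illustrative reading of the "pooling and a beta parameter" refinement, not a statement of the paper's exact formulation.

$$p(S_k|T) = \frac{p(T|S_k)\,p(S_k)}{p(T)} \propto \frac{p(T|S_k)}{p(T)}$$

$$\log p(T|S_1,\dots,S_n) = \sum_{k=1}^{n} \log p(T|S_k) - (n-1)\log p(T) + \text{const}$$

$$\log p(T|S_1,\dots,S_n) = (\beta+1)\,\overline{\log p(T|S)} - \beta\,\log p(T) + \text{const}, \qquad \overline{\log p(T|S)} = \frac{1}{n}\sum_{k=1}^{n}\log p(T|S_k),\quad \beta = n-1$$

Under this reading, treating $\beta$ as a tunable hyperparameter and replacing the average with another pooling operator (for example, selecting a single window) gives a family of variants. Each window needs only one forward pass, which is where the linear cost comes from.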
Abstract
Large Language Models (LLMs) have shown promising in-context learning abilities. However, conventional In-Context Learning (ICL) approaches are often impeded by the length limitations of the transformer architecture, which make it difficult to integrate supervision from a large number of demonstration examples. In this paper, we introduce a novel framework, called Naive Bayes-based Context Extension (NBCE), that enables existing LLMs to perform ICL with an increased number of demonstrations by significantly expanding their context size. Importantly, this expansion requires neither fine-tuning nor a particular model architecture, and it preserves linear efficiency. NBCE first splits the context into equal-sized windows that fit the target LLM's maximum length. Then, it introduces a voting mechanism to select the most relevant window, regarded as the posterior context. Finally, it employs Bayes' theorem to generate the output for the test task. Our experimental results demonstrate that NBCE substantially enhances performance, particularly as the number of demonstration examples increases, consistently outperforming alternative methods. The NBCE code is available at: https://github.com/amurtadha/NBCE-master
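The three steps described in the abstract (split, vote, combine with Bayes' theorem) can be illustrated with a minimal sketch. It is not the released implementation. The function names, the use of lowest entropy as the voting criterion, the combination rule `(beta + 1) * selected - beta * prior`, and the default `beta=0.25` are all assumptions made for illustration. The per-window and context-free log-probabilities are taken as given inputs rather than computed by an actual LLM.

```python
import math


def log_softmax(scores):
    """Normalize raw scores into log-probabilities (numerically stable)."""
    m = max(scores)
    log_z = m + math.log(sum(math.exp(s - m) for s in scores))
    return [s - log_z for s in scores]


def entropy(log_probs):
    """Shannon entropy of a distribution given as finite log-probabilities."""
    return -sum(math.exp(lp) * lp for lp in log_probs)


def split_into_windows(demonstrations, window_size):
    """Step 1 (simplified): group demonstrations into windows by count.

    The abstract describes equal-sized windows bounded by the model's maximum
    length; here the last window may be shorter, which is a simplification.
    """
    return [
        demonstrations[i:i + window_size]
        for i in range(0, len(demonstrations), window_size)
    ]


def nbce_combine(window_log_probs, prior_log_probs, beta=0.25):
    """Steps 2 and 3 under the stated assumptions.

    window_log_probs: one list of log p(T | S_k) per window, over candidates T.
    prior_log_probs:  log p(T) obtained without any demonstrations.
    Returns the index of the selected window and the combined log-probabilities.
    """
    # Step 2 (assumed criterion): pick the window whose prediction is most
    # confident, i.e. has the lowest entropy.
    selected = min(
        range(len(window_log_probs)),
        key=lambda k: entropy(window_log_probs[k]),
    )
    pooled = window_log_probs[selected]
    # Step 3 (assumed rule): Bayes-style combination, then renormalize.
    combined = [(beta + 1.0) * c - beta * p
                for c, p in zip(pooled, prior_log_probs)]
    return selected, log_softmax(combined)


# Hypothetical usage with three candidate labels and two windows.
windows = [log_softmax([0.1, 0.0, -0.1]), log_softmax([2.0, 0.0, -1.0])]
prior = log_softmax([0.0, 0.0, 0.0])
chosen_window, final_log_probs = nbce_combine(windows, prior)
prediction = max(range(len(final_log_probs)), key=final_log_probs.__getitem__)
```

In a real setting, each entry of `windows` would come from one forward pass of the LLM over a window of demonstrations followed by the test input, so the cost grows linearly with the number of windows.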
