Improving In-Context Learning with Small Language Model Ensembles

M. Mehdi Mojarradi; Lingyi Yang; Robert McCraith; Adam Mahdi

TL;DR

This work addresses the gap between cheap in-context learning (ICL) and costly fine-tuning by introducing Ensemble SuperICL, which places the predictions and confidence scores of multiple small language models (SLMs) inside the ICL prompt. Because each demonstration and each test input is paired with the SLMs' predicted labels and confidences, a large language model (LLM) can weigh those signals against one another to choose a label, without being fine-tuned on the dataset itself. The approach achieves state-of-the-art (SoTA) results on several GLUE benchmarks and improves accuracy on a medical labelling task (MedMCQA). An ablation study and sensitivity analyses examine the contribution of each component. The method is a promising route to efficient domain specialisation in settings where resources are constrained.

Abstract

Large language models (LLMs) have shown impressive capabilities across various tasks, but their performance on domain-specific tasks remains limited. While methods such as retrieval-augmented generation and fine-tuning can help to address this, they require significant resources. In-context learning (ICL) is a cheap and efficient alternative, but it cannot match the accuracy of these more advanced methods. We present Ensemble SuperICL, a novel approach that enhances ICL by leveraging the expertise of multiple fine-tuned small language models (SLMs). Ensemble SuperICL achieves state-of-the-art (SoTA) results on several natural language understanding benchmarks. Additionally, we test it on a medical-domain labelling task and showcase its practicality by using off-the-shelf SLMs fine-tuned on a general language task, achieving higher accuracy in large-scale data labelling than all baselines. Finally, we conduct an ablation study and sensitivity analyses to elucidate the underlying mechanism of Ensemble SuperICL. Our research contributes to the growing demand for efficient domain specialisation methods in LLMs, offering a cheap and effective method for practitioners.
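
The mechanism summarised above, in which SLM predictions and confidence scores accompany both the in-context demonstrations and the test input, can be illustrated with a minimal prompt-construction sketch. Everything here is an assumption made for illustration: the `Example` class, the `format_example` and `build_prompt` helpers, the SLM names, and the exact prompt wording are hypothetical and are not taken from the paper, whose actual template may differ.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Example:
    """One input plus the outputs of each SLM (hypothetical container)."""
    text: str
    # One (predicted_label, confidence) pair per SLM, in a fixed model order.
    slm_outputs: List[Tuple[str, float]]
    # Gold label for demonstrations; None for the test input.
    gold_label: Optional[str] = None


def format_example(ex: Example, slm_names: List[str]) -> str:
    """Render one example with every SLM's prediction and confidence."""
    if len(ex.slm_outputs) != len(slm_names):
        raise ValueError("expected one (label, confidence) pair per SLM")
    lines = [f"Input: {ex.text}"]
    for name, (label, conf) in zip(slm_names, ex.slm_outputs):
        lines.append(f"{name} Prediction: {label} (Confidence: {conf:.2f})")
    # Demonstrations show the gold label; the test input leaves it open
    # so that the LLM completes it.
    if ex.gold_label is not None:
        lines.append(f"Label: {ex.gold_label}")
    else:
        lines.append("Label:")
    return "\n".join(lines)


def build_prompt(demos: List[Example], test: Example, slm_names: List[str]) -> str:
    """Concatenate the demonstrations and the test input into one ICL prompt."""
    blocks = [format_example(d, slm_names) for d in demos]
    blocks.append(format_example(test, slm_names))
    return "\n\n".join(blocks)


if __name__ == "__main__":
    names = ["SLM-A", "SLM-B"]  # placeholder model names
    demos = [
        Example("A moving, well-acted film.",
                [("positive", 0.91), ("positive", 0.77)], "positive"),
    ]
    test = Example("The plot drags badly.",
                   [("negative", 0.64), ("positive", 0.52)])
    print(build_prompt(demos, test, names))
```

In this sketch the LLM sees where the SLMs agree, where they disagree, and how confident each one is, and it is left to produce the final label after the trailing `Label:` line.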
