Active Few-Shot Learning for Text Classification
Saeed Ahmadnia, Arash Yousefi Jordehi, Mahsa Hosseini Khasheh Heyran, Seyed Abolghasem Mirroshandel, Owen Rambow, Cornelia Caragea
TL;DR
This paper introduces an iterative active-learning pipeline for few-shot text classification. In each iteration, it selects $M$ unlabeled samples from a pool using hybrid uncertainty and representativeness criteria and adds them to the $K$-sized support set used to fine-tune a large language model. The pipeline draws on two embedding streams, En (encoder-based) and Sc (score-based), and five sampling strategies (Random, Rep, Un, UnRep, ClUn) that balance uncertainty, diversity, and representativeness. Evaluated on five tasks with BART and FLAN-T5, the method consistently outperforms random and non-iterative baselines, with FLAN-T5-Rep(En)-ClUn(En) often providing the strongest gains; in-context learning generally lags behind iterative fine-tuning but offers complementary insights. The implementation is open source, and the paper discusses practical trade-offs between accuracy and efficiency and points to future work on additional embeddings, semi-supervised extensions, and multilingual settings.
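To make the selection step concrete, the following is a minimal sketch of one hybrid selection round, not the authors' implementation. It assumes uncertainty is measured as the entropy of predicted class probabilities and representativeness as the mean cosine similarity of a sample's embedding to the rest of the pool. The function names, the min-max normalization, and the mixing weight `alpha` are illustrative assumptions that do not come from the paper.

```python
import numpy as np

def entropy(probs):
    # Shannon entropy per row; a higher value means a less confident prediction.
    p = np.clip(np.asarray(probs, dtype=float), 1e-12, 1.0)
    return -(p * np.log(p)).sum(axis=1)

def representativeness(emb):
    # Mean cosine similarity of each pool sample to every pool sample.
    e = np.asarray(emb, dtype=float)
    norms = np.linalg.norm(e, axis=1, keepdims=True)
    unit = e / np.clip(norms, 1e-12, None)
    return (unit @ unit.T).mean(axis=1)

def min_max(x):
    # Rescale scores to [0, 1] so the two criteria are comparable.
    span = x.max() - x.min()
    return np.zeros_like(x) if span == 0 else (x - x.min()) / span

def select_samples(probs, emb, m, alpha=0.5):
    # Hypothetical hybrid score: alpha weights uncertainty,
    # (1 - alpha) weights representativeness. Returns the indices
    # of the m highest-scoring pool samples.
    score = alpha * min_max(entropy(probs)) \
        + (1 - alpha) * min_max(representativeness(emb))
    order = np.argsort(-score, kind="stable")
    return order[:m].tolist()
```

In an iterative loop, the returned indices would be labeled, moved from the pool into the support set, and the model fine-tuned again before the next round.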
Abstract
The rise of Large Language Models (LLMs) has boosted the use of Few-Shot Learning (FSL) methods in natural language processing, which achieve acceptable performance even with limited training data. The goal of FSL is to make effective use of a small number of annotated samples during learning. However, FSL performance suffers when unsuitable support samples are chosen. The problem arises because FSL relies heavily on a small number of support samples, which hampers consistent improvement even when more support samples are added. To address this challenge, we propose an active learning-based instance selection mechanism that identifies effective support instances from the unlabeled pool and can work with different LLMs. Our experiments on five tasks show that our method frequently improves the performance of FSL. We make our implementation available on GitHub.
