In-context Continual Learning Assisted by an External Continual Learner
Saleh Momeni, Sahisnu Mazumder, Zixuan Ke, Bing Liu
TL;DR
This work tackles the scalability and performance limitations of in-context learning (ICL) when it is used for class-incremental learning (CIL) in NLP. It introduces InCA, which couples an External Continual Learner (ECL) with in-context prompts. The ECL builds a Gaussian representation for each class from SBERT embeddings of semantic tags and uses Mahalanobis distance to select a compact top-$k$ set of candidate classes; the LLM then makes the final prediction from a prompt containing per-class summaries of only those candidates. InCA is replay-free and does not update any LLM parameters, which mitigates catastrophic forgetting while avoiding excessive prompt length. Across four datasets, InCA outperforms traditional fine-tuning baselines and remains competitive under data-constrained scenarios, demonstrating the practical value of combining an external, non-training-based classifier with in-context learning for scalable CIL.
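The candidate-selection step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the class name `ExternalContinualLearner`, the shared covariance across classes, the `ridge` regularizer, and the toy 4-dimensional vectors standing in for SBERT tag embeddings are all assumptions made for illustration.

```python
import numpy as np


class ExternalContinualLearner:
    """Hypothetical sketch: one mean per class plus a shared covariance (assumed)."""

    def __init__(self, dim, ridge=1e-3):
        self.dim = dim
        self.ridge = ridge  # assumed regularizer that keeps the covariance invertible
        self.means = {}  # class label -> mean embedding
        self.scatter = np.zeros((dim, dim))  # running sum of within-class scatter
        self.total = 0  # number of embeddings seen so far

    def add_class(self, label, embeddings):
        """Add a new class from its embeddings; stored means of old classes are untouched."""
        x = np.asarray(embeddings, dtype=float)
        mean = x.mean(axis=0)
        centered = x - mean
        self.means[label] = mean
        self.scatter += centered.T @ centered
        self.total += len(x)

    def top_k(self, embedding, k):
        """Return the k class labels with the smallest squared Mahalanobis distance."""
        cov = self.scatter / max(self.total, 1) + self.ridge * np.eye(self.dim)
        precision = np.linalg.inv(cov)
        z = np.asarray(embedding, dtype=float)
        dists = {}
        for label, mean in self.means.items():
            diff = z - mean
            dists[label] = float(diff @ precision @ diff)
        return sorted(dists, key=dists.get)[:k]


# Toy data: random vectors stand in for SBERT embeddings of semantic tags.
rng = np.random.default_rng(0)
ecl = ExternalContinualLearner(dim=4)
centers = {
    "billing": [5.0, 0.0, 0.0, 0.0],
    "shipping": [0.0, 5.0, 0.0, 0.0],
    "returns": [0.0, 0.0, 5.0, 0.0],
}
for label, center in centers.items():  # classes arrive one at a time
    ecl.add_class(label, rng.normal(loc=center, scale=0.5, size=(20, 4)))

candidates = ecl.top_k(np.array([4.8, 0.1, 0.0, 0.0]), k=2)
```

Because each class only contributes a mean and an additive update to the running scatter matrix, adding a class never overwrites what was stored for earlier classes, which is the property that lets such a learner grow without replay in this sketch.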
Abstract
Existing continual learning (CL) methods mainly rely on fine-tuning or adapting large language models (LLMs). They still suffer from catastrophic forgetting (CF). Little work has been done to exploit in-context learning (ICL) to leverage the extensive knowledge within LLMs for CL without updating any parameters. However, incrementally learning each new task in ICL necessitates adding training examples from each class of the task to the prompt, which hampers scalability as the prompt length increases. This issue not only leads to excessively long prompts that exceed the input token limit of the underlying LLM but also degrades the model's performance due to the overextended context. To address this, we introduce InCA, a novel approach that integrates an external continual learner (ECL) with ICL to enable scalable CL without CF. The ECL is built incrementally to pre-select a small subset of likely classes for each test instance. By restricting the ICL prompt to only these selected classes, InCA prevents prompt lengths from becoming excessively long, while maintaining high performance. Experimental results demonstrate that InCA significantly outperforms existing CL baselines, achieving substantial performance gains.
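To make the prompt restriction concrete, the following sketch assembles a prompt from only the pre-selected classes. The function `build_prompt`, the prompt wording, and the example summaries are hypothetical; the paper's actual prompt template is not given in this section.

```python
def build_prompt(query, candidates, summaries):
    """Assemble a prompt that lists only the pre-selected candidate classes.

    `summaries` maps every learned class to a short description, but only
    the entries named in `candidates` are placed in the prompt.
    """
    lines = ["Classify the input into one of the following classes."]
    for label in candidates:
        lines.append(f"- {label}: {summaries[label]}")
    lines.append(f"Input: {query}")
    lines.append("Class:")
    return "\n".join(lines)


# Hypothetical summaries for three learned classes.
summaries = {
    "billing": "Questions about charges and invoices.",
    "shipping": "Questions about delivery times and tracking.",
    "returns": "Requests to send an item back for a refund.",
}

# Suppose the external learner pre-selected two of the three classes.
prompt = build_prompt("Where is my package?", ["shipping", "returns"], summaries)
```

In this sketch the prompt grows with the number of selected candidates rather than with the total number of classes learned so far, which is the scalability argument made in the abstract.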
