Continual Lifelong Learning in Natural Language Processing: A Survey
Magdalena Biesialska, Katarzyna Biesialska, Marta R. Costa-jussà
TL;DR
This survey summarizes the state of continual lifelong learning (CL) in NLP, detailing the problem of catastrophic forgetting and the stability-plasticity trade-off. It classifies CL methods into rehearsal, regularization, memory, distillation, and architectural families, and discusses their applicability to NLP tasks. It reviews evaluation protocols, benchmarks, and datasets, highlighting the lack of standardized NLP-specific CL benchmarks and proposing key metrics for measuring progress. It then surveys CL applications across NLP tasks (embeddings, language modeling, QA, sentiment analysis, and machine translation) and outlines critical gaps and future directions, including task-agnostic CL, causal reasoning, efficiency, and robust benchmarks.
Abstract
Continual learning (CL) aims to enable information systems to learn from a continuous data stream across time. However, it is difficult for existing deep learning architectures to learn a new task without largely forgetting previously acquired knowledge. Furthermore, CL is particularly challenging for language learning, as natural language is ambiguous: it is discrete, compositional, and its meaning is context-dependent. In this work, we look at the problem of CL through the lens of various NLP tasks. Our survey discusses major challenges in CL and current methods applied in neural network models. We also provide a critical review of the existing CL evaluation methods and datasets in NLP. Finally, we present our outlook on future research directions.
