Towards Lifelong Learning of Large Language Models: A Survey

Junhao Zheng; Shengjie Qiu; Chengming Shi; Qianli Ma

TL;DR

This survey addresses lifelong learning for large language models by introducing a two-pronged taxonomy of Internal Knowledge and External Knowledge and by organizing the literature into 12 distinct learning scenarios. It synthesizes four core technique families (replay, regularization, architecture, and distillation) across continual pretraining and continual finetuning, and extends to external knowledge via retrieval-based and tool-based approaches. The paper highlights key advances in continual vertical domain, language, and temporal pretraining, alongside continual finetuning applications such as text classification, named entity recognition (NER), relation extraction, machine translation (MT), instruction tuning, and alignment, and it also covers knowledge editing. It further reviews benchmarks, datasets, open challenges (e.g., catastrophic forgetting and alignment tax), and future directions such as multimodal lifelong learning and more efficient, scalable architectures. Overall, the work provides a framework for building robust, adaptable, and up-to-date LLMs in dynamic real-world environments.

Abstract

As the applications of large language models (LLMs) expand across diverse fields, the ability of these models to adapt to ongoing changes in data, tasks, and user preferences becomes crucial. Traditional training methods, relying on static datasets, are increasingly inadequate for coping with the dynamic nature of real-world information. Lifelong learning, also known as continual or incremental learning, addresses this challenge by enabling LLMs to learn continuously and adaptively over their operational lifetime, integrating new knowledge while retaining previously learned information and preventing catastrophic forgetting. This survey delves into the sophisticated landscape of lifelong learning, categorizing strategies into two primary groups: Internal Knowledge and External Knowledge. Internal Knowledge includes continual pretraining and continual finetuning, each enhancing the adaptability of LLMs in various scenarios. External Knowledge encompasses retrieval-based and tool-based lifelong learning, leveraging external data sources and computational tools to extend the model's capabilities without modifying core parameters. The key contributions of our survey are: (1) Introducing a novel taxonomy categorizing the extensive literature of lifelong learning into 12 scenarios; (2) Identifying common techniques across all lifelong learning scenarios and classifying existing literature into various technique groups within each scenario; (3) Highlighting emerging techniques such as model expansion and data selection, which were less explored in the pre-LLM era. Through a detailed examination of these groups and their respective categories, this survey aims to enhance the adaptability, reliability, and overall performance of LLMs in real-world applications.

Paper Structure (60 sections, 6 figures, 3 tables)

Introduction
Overview of Lifelong Learning
Problem Formulation
Evaluation Metrics
Common Techniques
Replay-Based Methods
Regularization-Based Methods
Architecture-Based Methods
Distillation-Based Methods
Benchmarks and Datasets
Methodology: Continual Pretraining
Continual Vertical Domain Pretraining
Parameter-Efficient Fine-Tuning
Model Expansion
Re-warming
...and 45 more sections

Figures (6)

Figure 1: An illustration of lifelong learning: humans can incrementally learn new skills such as walking, riding a bike, and driving a car. Similarly, lifelong learning aims to equip LLMs with new languages, domain knowledge, and information.
Figure 2: Taxonomy of lifelong learning methods for LLMs.
Figure 3: Four categories of common techniques for lifelong learning with LLMs.
Figure 4: Six categories of architecture-based lifelong methods for LLMs.
Figure 5: An illustration of continual finetuning scenarios. In each continual finetuning scenario, a model learns task $t-1$, $t$, and $t+1$ sequentially (left to right). The PURPLE and the GREEN boxes represent the input and the output respectively.
...and 1 more figure
