Towards Lifelong Learning of Large Language Models: A Survey

Junhao Zheng, Shengjie Qiu, Chengming Shi, Qianli Ma

TL;DR

This survey addresses lifelong learning for large language models by introducing a two-pronged taxonomy of Internal Knowledge and External Knowledge, and by organizing the literature into 12 distinct learning scenarios. It synthesizes four core technique families (replay, regularization, architecture, and distillation) across continual pretraining and continual finetuning, and extends to external knowledge via retrieval-based and tool-based approaches. The paper highlights key advances in continual vertical, language, and temporal pretraining, alongside continual finetuning applications such as text classification, named entity recognition (NER), relation extraction, machine translation (MT), instruction tuning, knowledge editing, and alignment. It also covers benchmarks, datasets, open challenges (e.g., catastrophic forgetting and the alignment tax), and future directions such as multimodal lifelong learning and more efficient, scalable architectures. Overall, the work provides a framework for keeping LLMs robust, adaptable, and up to date in dynamic real-world environments.

Abstract

As the applications of large language models (LLMs) expand across diverse fields, the ability of these models to adapt to ongoing changes in data, tasks, and user preferences becomes crucial. Traditional training methods, relying on static datasets, are increasingly inadequate for coping with the dynamic nature of real-world information. Lifelong learning, also known as continual or incremental learning, addresses this challenge by enabling LLMs to learn continuously and adaptively over their operational lifetime, integrating new knowledge while retaining previously learned information and preventing catastrophic forgetting. This survey delves into the sophisticated landscape of lifelong learning, categorizing strategies into two primary groups: Internal Knowledge and External Knowledge. Internal Knowledge includes continual pretraining and continual finetuning, each enhancing the adaptability of LLMs in various scenarios. External Knowledge encompasses retrieval-based and tool-based lifelong learning, leveraging external data sources and computational tools to extend the model's capabilities without modifying core parameters. The key contributions of our survey are: (1) Introducing a novel taxonomy categorizing the extensive literature of lifelong learning into 12 scenarios; (2) Identifying common techniques across all lifelong learning scenarios and classifying existing literature into various technique groups within each scenario; (3) Highlighting emerging techniques such as model expansion and data selection, which were less explored in the pre-LLM era. Through a detailed examination of these groups and their respective categories, this survey aims to enhance the adaptability, reliability, and overall performance of LLMs in real-world applications.

Paper Structure

This paper contains 60 sections, 6 figures, and 3 tables.

Figures (6)

  • Figure 1: An illustration of lifelong learning: humans can incrementally learn new skills such as walking, riding a bike, and driving a car. Similarly, lifelong learning aims to equip LLMs with new languages, domain knowledge, and information.
  • Figure 2: Taxonomy of lifelong learning methods for LLMs.
  • Figure 3: Four categories of common techniques for lifelong learning with LLMs.
  • Figure 4: Six categories of architecture-based lifelong methods for LLMs.
  • Figure 5: An illustration of continual finetuning scenarios. In each scenario, a model learns tasks $t-1$, $t$, and $t+1$ sequentially (left to right). The purple and green boxes represent the input and the output, respectively.
  • ...and 1 more figure