Continual Learning with Pre-Trained Models: A Survey

Da-Wei Zhou, Hai-Long Sun, Jingyi Ning, Han-Jia Ye, De-Chuan Zhan

TL;DR

This survey analyzes how pre-trained models (PTMs), especially Vision Transformers, change continual learning (CL) by reducing forgetting and enabling efficient adaptation. It introduces a three-way taxonomy (prompt-based, representation-based, and model mixture-based CL) and empirically compares representative methods across seven benchmarks, highlighting fairness concerns in cross-method evaluations. The findings indicate that representation-based approaches often match or exceed prompting methods, and that simple prototype-based strategies can be surprisingly strong baselines. The paper also outlines forward-looking directions, including lifelong editing of large language models, multi-modal PTMs, resource-efficient CL, and new benchmarks with larger domain gaps.

Abstract

Real-world applications often face streaming data, which requires the learning system to absorb new knowledge as the data evolves. Continual Learning (CL) aims to achieve this goal while overcoming catastrophic forgetting of former knowledge when learning new knowledge. Typical CL methods build the model from scratch and grow it with incoming data. However, the advent of the pre-trained model (PTM) era has sparked immense research interest, particularly in leveraging PTMs' robust representational capabilities. This paper presents a comprehensive survey of the latest advancements in PTM-based CL. We categorize existing methodologies into three distinct groups, providing a comparative analysis of their similarities, differences, and respective advantages and disadvantages. Additionally, we offer an empirical study contrasting various state-of-the-art methods to highlight concerns regarding fairness in comparisons. The source code to reproduce these evaluations is available at: https://github.com/sun-hailong/LAMDA-PILOT

Paper Structure (12 sections, 9 equations, 3 figures, 1 table)

Figures (3)

  • Figure 1: Illustrations of CL and its variations. New tasks arrive sequentially, and the model needs to learn them incrementally. After learning each task, the model is evaluated on all seen tasks. Traditional CL methods start from randomly initialized weights, while PTM-based methods make use of substantial and informative data to pre-train the CL model.
  • Figure 2: Taxonomy of PTM-based CL. We classify existing methods into three subcategories, i.e., prompt-based, representation-based, and model mixture-based. Different colors indicate different categories, and we list representative works in the boxes.
  • Figure 3: Different kinds of prompt selection, including key-value matching, shared and task-specific retrieval, attention-based combination, and instance-specific prompt generation.
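
The key-value matching named in the Figure 3 caption can be illustrated with a small sketch. It assumes each prompt in a pool carries a key vector, and that an input's query feature (in practice, extracted by the frozen PTM) retrieves the prompts whose keys are most similar to it. The pool contents, the names `select_prompts` and `prompt_pool`, the use of cosine similarity, and the toy 3-dimensional vectors are all illustrative assumptions, not the survey's implementation.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def select_prompts(query, prompt_pool, top_k=1):
    """Return the names of the top_k prompts whose keys best match the query."""
    ranked = sorted(
        prompt_pool,
        key=lambda name: cosine_similarity(query, prompt_pool[name]["key"]),
        reverse=True,
    )
    return ranked[:top_k]


# Hypothetical prompt pool: each entry pairs a key with a value.
# The values stand in for learnable prompt tokens.
prompt_pool = {
    "prompt_a": {"key": [1.0, 0.0, 0.0], "value": "tokens_a"},
    "prompt_b": {"key": [0.0, 1.0, 0.0], "value": "tokens_b"},
    "prompt_c": {"key": [0.0, 0.0, 1.0], "value": "tokens_c"},
}

# Toy query feature; a real system would compute it with the frozen PTM.
query = [0.9, 0.1, 0.0]
selected = select_prompts(query, prompt_pool, top_k=2)
print(selected)
```

In this sketch the selected prompts' values would then be attached to the input of the frozen backbone. The other selection schemes in the caption differ mainly in how the prompts are chosen or combined.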