Dual Prototypes for Adaptive Pre-Trained Model in Class-Incremental Learning
Zhiming Xu, Suorong Yang, Baile Xu, Furao Shen, Jian Zhao
TL;DR
The paper tackles catastrophic forgetting in class-incremental learning by freezing pre-trained transformers and introducing a per-task adapter system trained with a Center-Adapt loss. It adds a dual-prototype classifier that uses raw prototypes for reliable top-K candidate labels and augmented prototypes for final refinement, enabling test-time adapter selection without exhaustively loading all adapters. Empirical results show state-of-the-art or competitive performance across diverse benchmarks, with notable gains on VTAB and strong exemplar-free performance, while analyses reveal limitations on low-resolution data and latency trade-offs. The approach offers a flexible, plug-and-play framework for PTM-based CIL with efficient storage and inference characteristics.
Abstract
Class-incremental learning (CIL) aims to learn new classes while retaining previous knowledge. Although pre-trained model (PTM) based approaches show strong performance, directly fine-tuning PTMs on incremental task streams often causes renewed catastrophic forgetting. This paper proposes a Dual-Prototype Network with Task-wise Adaptation (DPTA) for PTM-based CIL. For each incremental learning task, an adapter module is built to fine-tune the PTM, and a center-adapt loss makes the representations more tightly clustered around class centers and more separable between classes. The dual prototype network improves prediction by enabling test-time adapter selection: the raw prototypes infer several candidate task indices for a test sample, which are used to select suitable adapter modules for the PTM, and the augmented prototypes, which can separate confusable classes, determine the final result. Experiments on multiple benchmarks show that DPTA consistently surpasses recent methods by 1% to 5%. Notably, on the VTAB dataset, it achieves approximately 3% improvement over state-of-the-art methods. The code is open-sourced at https://github.com/Yorkxzm/DPTA.
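The two-stage prediction described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`cosine_sim`, `dual_prototype_predict`), the use of cosine similarity, the per-task feature extractors passed in as plain callables, and the choice to restrict the final decision to the top-K candidate classes are all assumptions made for illustration.

```python
import numpy as np

def cosine_sim(x, protos):
    """Cosine similarity between one feature vector and each prototype row."""
    x = x / np.linalg.norm(x)
    protos = protos / np.linalg.norm(protos, axis=1, keepdims=True)
    return protos @ x

def dual_prototype_predict(sample, raw_extract, adapted_extract,
                           raw_protos, aug_protos, class_to_task, k=3):
    """Hypothetical two-stage prediction with raw and augmented prototypes.

    raw_extract:     callable standing in for the frozen PTM (assumption)
    adapted_extract: dict task_id -> callable standing in for PTM + adapter
    raw_protos:      (num_classes, d) prototypes in the raw feature space
    aug_protos:      (num_classes, d) prototypes in the adapted feature space
    class_to_task:   list mapping class index -> task index
    """
    # Stage 1: raw prototypes propose the top-K candidate labels.
    sims = cosine_sim(raw_extract(sample), raw_protos)
    top_k = np.argsort(-sims)[:k]

    # Only adapters of tasks that own a candidate label are evaluated.
    candidate_tasks = sorted({class_to_task[c] for c in top_k})

    # Stage 2: augmented prototypes pick the final label among candidates.
    best_class, best_score = -1, -np.inf
    for task in candidate_tasks:
        feat = adapted_extract[task](sample)
        cands = [c for c in top_k if class_to_task[c] == task]
        scores = cosine_sim(feat, aug_protos[cands])
        i = int(np.argmax(scores))
        if scores[i] > best_score:
            best_class, best_score = int(cands[i]), float(scores[i])
    return best_class

# Toy setup: 4 classes split over 2 tasks, 4-dimensional features.
raw_protos = np.eye(4)
aug_protos = np.eye(4)
class_to_task = [0, 0, 1, 1]
raw_extract = lambda x: x
adapted_extract = {
    0: lambda x: x,
    1: lambda x: x * np.array([1.0, 1.0, 0.5, 2.0]),  # toy "adapter"
}
sample = np.array([0.0, 0.0, 0.6, 0.5])
prediction = dual_prototype_predict(sample, raw_extract, adapted_extract,
                                    raw_protos, aug_protos, class_to_task, k=2)
```

In this toy setup both candidate labels belong to task 1, so only that task's extractor is called, which mirrors the idea of selecting adapters at test time rather than running all of them.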
