Adaptive Adapter Routing for Long-Tailed Class-Incremental Learning
Zhi-Hong Qi, Da-Wei Zhou, Yiran Yao, Han-Jia Ye, De-Chuan Zhan
TL;DR
The paper tackles long-tailed class-incremental learning under exemplar-free constraints by leveraging pre-trained Vision Transformers with adaptive adapters. APART introduces two adapter pools (one auxiliary for minority classes) and learns instance-specific routing weights $w(\mathbf{x},y)$ to guide when to apply the auxiliary loss, enabling comprehensive representation across all classes. Training combines losses from the main and auxiliary pools, weighted by the adaptive router, and inference ensembles logits from both pools for improved robustness. Experiments on CIFAR100, ImageNet-R, and ObjectNet show state-of-the-art performance with favorable memory usage, and ablations confirm the necessity and effectiveness of the adaptive routing and auxiliary pool components.
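The training and inference rules summarized above can be illustrated with a minimal sketch. It assumes that the combined objective takes the form $\mathcal{L}_{\text{main}} + w(\mathbf{x},y)\,\mathcal{L}_{\text{aux}}$ with cross-entropy losses, and that the inference ensemble sums the logits of the two pools; neither form is stated in this summary, so both are assumptions. The function names (`routed_loss`, `ensemble_predict`) are hypothetical and are not taken from the APART codebase.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(logits, label):
    """Cross-entropy loss of one instance for an integer class label."""
    return -math.log(softmax(logits)[label])

def routed_loss(main_logits, aux_logits, label, w):
    """Assumed objective: main-pool loss plus the auxiliary-pool loss
    scaled by the instance-specific routing weight w = w(x, y)."""
    return cross_entropy(main_logits, label) + w * cross_entropy(aux_logits, label)

def ensemble_predict(main_logits, aux_logits):
    """Assumed ensemble: sum the logits of both pools, then take the argmax."""
    combined = [a + b for a, b in zip(main_logits, aux_logits)]
    return max(range(len(combined)), key=combined.__getitem__)

# Illustrative values only (not from the paper).
main_logits = [2.0, 0.5, 0.1]   # logits produced through the main adapter pool
aux_logits = [0.2, 0.3, 3.0]    # logits produced through the auxiliary pool
loss = routed_loss(main_logits, aux_logits, label=2, w=0.8)
prediction = ensemble_predict(main_logits, aux_logits)
```

Under this assumed form, a routing weight near zero leaves an instance to be trained through the main pool alone, while a larger weight lets the auxiliary pool contribute more to its loss, which is the role the summary assigns to minority-class instances.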
Abstract
In our ever-evolving world, new data exhibits a long-tailed distribution, as seen in e-commerce platform reviews. This requires a model to learn continually from imbalanced data without forgetting, a challenge known as long-tailed class-incremental learning (LTCIL). Existing methods often rely on retraining linear classifiers with former data, which is impractical in real-world settings. In this paper, we harness the potent representation capabilities of pre-trained models and introduce AdaPtive Adapter RouTing (APART) as an exemplar-free solution for LTCIL. To counteract forgetting, we train inserted adapters with frozen pre-trained weights for deeper adaptation and maintain a pool of adapters for selection during sequential model updates. Additionally, we present an auxiliary adapter pool designed for effective generalization, especially on minority classes. Adaptive instance routing across these pools captures crucial correlations, facilitating a comprehensive representation of all classes. Consequently, APART tackles both the imbalance problem and catastrophic forgetting in a unified framework. Extensive benchmark experiments validate the effectiveness of APART. Code is available at: https://github.com/vita-qzh/APART
