Adaptive Weighted Parameter Fusion with CLIP for Class-Incremental Learning
Juncen Guo, Xiaoguang Zhu, Liangyu Teng, Hao Yang, Jing Liu, Yang Liu, Liang Song
TL;DR
This work tackles catastrophic forgetting in class-incremental learning by pairing a frozen CLIP backbone with an adaptive parameter module. It introduces a stacking-based low-rank parameter fusion that integrates task-specific knowledge, weighted by a dynamic balance factor derived from Maximum Mean Discrepancy (MMD) and Linear Discriminant Analysis (LDA) so that distribution alignment is traded off against class separability. Empirical results on CIFAR100 and ImageNet100 show state-of-the-art performance, a favorable stability-plasticity trade-off, and robustness to distribution shifts across tasks. The approach preserves the effective information in the parameter matrices while adapting efficiently to new classes, which makes it practical for multimodal, open-world incremental learning scenarios.
Abstract
Class-incremental learning (CIL) enables a model to absorb knowledge from new classes incrementally and to build a unified classifier over all classes encountered so far. When the model is optimized on new classes, knowledge of previous classes tends to be overwritten, which leads to catastrophic forgetting. Addressing this challenge requires a trade-off between retaining old knowledge and accommodating new information. However, this balancing process often sacrifices some information, which can partially reduce the model's ability to discriminate between classes. To tackle this issue, we propose adaptive weighted parameter fusion with Contrastive Language-Image Pre-training (CLIP), which accounts for the variability of the data distribution across tasks while retaining as much of the effective information in the parameter matrix as possible. In addition, we introduce a balance factor that weighs data distribution alignment against class distinguishability between adjacent tasks. Experimental results on several standard benchmarks validate the superiority of the proposed method.
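To make the idea of a balance factor concrete, the following is a minimal sketch and not the paper's formulation. It estimates the distribution shift between the features of two adjacent tasks with a squared MMD under an RBF kernel, measures their separability with an LDA-style Fisher ratio, and then combines the two into a value in [0, 1]. The function names, the kernel bandwidth `gamma`, and in particular the combination rule in `balance_factor` are assumptions chosen for illustration only.

```python
import math


def rbf_kernel(x, y, gamma=1.0):
    """Gaussian (RBF) kernel between two feature vectors."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


def mmd_squared(old_feats, new_feats, gamma=1.0):
    """Biased estimate of the squared MMD between two feature samples."""
    def mean_kernel(a, b):
        total = sum(rbf_kernel(p, q, gamma) for p in a for q in b)
        return total / (len(a) * len(b))

    return (mean_kernel(old_feats, old_feats)
            + mean_kernel(new_feats, new_feats)
            - 2.0 * mean_kernel(old_feats, new_feats))


def _mean(feats):
    """Per-dimension mean of a list of feature vectors."""
    dim = len(feats[0])
    return [sum(f[d] for f in feats) / len(feats) for d in range(dim)]


def _scatter(feats, center):
    """Sum of squared distances of the samples to a center."""
    return sum((f[d] - center[d]) ** 2
               for f in feats for d in range(len(center)))


def fisher_ratio(old_feats, new_feats, eps=1e-12):
    """LDA-style separability: between-group over within-group scatter."""
    mean_old, mean_new = _mean(old_feats), _mean(new_feats)
    between = sum((a - b) ** 2 for a, b in zip(mean_old, mean_new))
    within = _scatter(old_feats, mean_old) + _scatter(new_feats, mean_new)
    return between / (within + eps)


def balance_factor(old_feats, new_feats, gamma=1.0, eps=1e-12):
    """Hypothetical combination rule (an assumption, not from the paper).

    Returns the share of the MMD term in the sum of both terms, so the
    result lies in [0, 1].
    """
    shift = max(mmd_squared(old_feats, new_feats, gamma), 0.0)
    separability = fisher_ratio(old_feats, new_feats, eps)
    return shift / (shift + separability + eps)
```

In this sketch, a factor close to 0 means the separability term dominates the MMD term, and a factor close to 1 means the reverse. How the actual method maps these two quantities to fusion weights is not specified in the abstract.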
