Adaptive Weighted Parameter Fusion with CLIP for Class-Incremental Learning

Juncen Guo, Xiaoguang Zhu, Liangyu Teng, Hao Yang, Jing Liu, Yang Liu, Liang Song

TL;DR

This work tackles catastrophic forgetting in class-incremental learning by pairing a frozen CLIP backbone with an adaptive parameter module. It introduces a stacking-based low-rank parameter fusion to integrate task-specific knowledge, controlled by a dynamic balance factor derived from maximum mean discrepancy (MMD) and linear discriminant analysis (LDA), which trades off distribution alignment against class separability. Empirical results on CIFAR100 and ImageNet100 show state-of-the-art performance, with a strong stability-plasticity trade-off and robust handling of distribution shifts across tasks. The approach preserves the effective information in the parameter matrices while adapting efficiently to new classes, which is useful for multimodal, open-world incremental learning scenarios.
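
To make the idea of stacking-based low-rank fusion concrete, here is a minimal NumPy sketch. It is not taken from the paper: it assumes each task contributes a LoRA-style pair of factors `(B_i, A_i)` and a scalar weight `w_i`, and every name in it (`stack_low_rank`, `old`, `new`, `weights`) is hypothetical. It only illustrates the algebraic fact that concatenating the factors along the rank dimension reproduces the weighted sum of the individual low-rank updates without discarding any factor.

```python
import numpy as np

def stack_low_rank(factors, weights):
    """Stack weighted low-rank factors along the rank dimension.

    factors: list of (B_i, A_i) with B_i of shape (d_out, r_i)
             and A_i of shape (r_i, d_in)  (assumed layout).
    weights: one scalar per task (hypothetical fusion weights).
    """
    # Scale each B_i by its weight, then concatenate column-wise.
    B_stacked = np.concatenate(
        [w * B for w, (B, _) in zip(weights, factors)], axis=1
    )
    # Concatenate the A_i row-wise so the rank dimensions line up.
    A_stacked = np.concatenate([A for _, A in factors], axis=0)
    return B_stacked, A_stacked

rng = np.random.default_rng(0)
d_out, d_in, r = 8, 6, 2
old = (rng.normal(size=(d_out, r)), rng.normal(size=(r, d_in)))
new = (rng.normal(size=(d_out, r)), rng.normal(size=(r, d_in)))
weights = [0.3, 0.7]  # illustrative values only

B, A = stack_low_rank([old, new], weights)
fused = B @ A

# The stacked product equals the weighted sum of per-task updates.
reference = weights[0] * (old[0] @ old[1]) + weights[1] * (new[0] @ new[1])
print(np.allclose(fused, reference))
```

The stacked factors have rank dimension `r_old + r_new`, so both tasks' factors are kept intact rather than averaged element-wise.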

Abstract

Class-incremental learning (CIL) enables a model to incrementally absorb knowledge from new classes and build a generic classifier across all previously encountered classes. When the model is optimized on new classes, knowledge of previous classes is inevitably erased, leading to catastrophic forgetting. Addressing this challenge requires a trade-off between retaining old knowledge and accommodating new information. However, this balancing process often sacrifices some information, which can partially degrade the model's ability to discriminate between classes. To tackle this issue, we design an adaptive weighted parameter fusion method with Contrastive Language-Image Pre-training (CLIP), which not only accounts for the variability of the data distribution across tasks, but also retains the effective information of the parameter matrix to the greatest extent. In addition, we introduce a balance factor that balances data distribution alignment and distinguishability between adjacent tasks. Experimental results on several traditional benchmarks validate the superiority of the proposed method.
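
The two quantities behind the balance factor can be illustrated with a short NumPy sketch. This is an assumption-laden illustration, not the paper's formulation: it uses a biased squared-MMD estimate with an RBF kernel as the alignment measure and a Fisher-style scatter ratio (the criterion underlying LDA) as the distinguishability measure. The function names, the kernel choice, and `gamma` are all hypothetical, and because this page does not say how the two are combined into one factor, the sketch stops at the two ingredients.

```python
import numpy as np

def rbf_kernel(x, y, gamma):
    """RBF kernel matrix between rows of x and rows of y."""
    sq = (x ** 2).sum(1)[:, None] + (y ** 2).sum(1)[None, :] - 2 * x @ y.T
    return np.exp(-gamma * np.maximum(sq, 0.0))

def mmd2(x, y, gamma=0.5):
    """Biased estimate of squared MMD between two samples."""
    return (rbf_kernel(x, x, gamma).mean()
            + rbf_kernel(y, y, gamma).mean()
            - 2 * rbf_kernel(x, y, gamma).mean())

def fisher_ratio(feats, labels):
    """Between-class scatter trace divided by within-class scatter trace."""
    mean = feats.mean(0)
    between = within = 0.0
    for c in np.unique(labels):
        fc = feats[labels == c]
        between += len(fc) * ((fc.mean(0) - mean) ** 2).sum()
        within += ((fc - fc.mean(0)) ** 2).sum()
    return between / within

rng = np.random.default_rng(0)

# Alignment: features of an "old" task vs. a same-distribution
# and a shifted "new" task (synthetic stand-ins for real features).
old_feats = rng.normal(size=(200, 4))
same_feats = rng.normal(size=(200, 4))
shifted_feats = rng.normal(loc=2.0, size=(200, 4))
mmd_same = mmd2(old_feats, same_feats)
mmd_shifted = mmd2(old_feats, shifted_feats)

# Distinguishability: two classes that are far apart vs. overlapping.
labels = np.repeat([0, 1], 100)
offset = np.where(labels[:, None] == 1, 1.0, -1.0)
separated = rng.normal(size=(200, 4)) + 3.0 * offset
overlapping = rng.normal(size=(200, 4)) + 0.1 * offset
ratio_separated = fisher_ratio(separated, labels)
ratio_overlapping = fisher_ratio(overlapping, labels)

print(mmd_same < mmd_shifted, ratio_overlapping < ratio_separated)
```

A larger MMD value signals a bigger distribution shift between adjacent tasks, while a larger scatter ratio signals better-separated classes; a balance factor in the spirit of the abstract would weigh one against the other.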

Paper Structure

This paper contains 20 sections, 5 equations, 4 figures, 3 tables.

Figures (4)

  • Figure 1: The framework of adaptive weighted parameter fusion with CLIP.
  • Figure 2: The stacking-based parameter fusion process.
  • Figure 3: t-SNE visualization on B0 Inc10 after the second task: (a) CLIP on CIFAR100, (b) CLIP on ImageNet100, (c) our method on CIFAR100, (d) our method on ImageNet100.
  • Figure 4: Sensitivity analysis on $\lambda$ with ImageNet100 B0 Inc10.