Knowledge Adaptation Network for Few-Shot Class-Incremental Learning

Ye Wang, Yaxiong Wang, Guoshuai Zhao, Xueming Qian

TL;DR

This work tackles few-shot class-incremental learning (FSCIL) by using the CLIP foundation model as a general-purpose backbone and introducing a Knowledge Adapter (KA) that injects data-specific, task-relevant knowledge through a Knowledge Vector Library ($\mathcal{M}$) and a query-based fusion mechanism. To bridge the gap between base and incremental sessions under data scarcity, the authors propose Incremental Pseudo Episode Learning (IPEL), which constructs pseudo incremental tasks from the base data to tune the KA for FSCIL. The proposed Knowledge Adaptation Network (KANet) achieves state-of-the-art performance on CIFAR100, CUB200, and ImageNet-R across multiple backbones, improving both accuracy and forgetting as measured by performance drop (PD). Overall, the approach demonstrates that combining a foundation-model backbone with targeted knowledge fusion and pseudo-task adaptation yields strong, practical gains for FSCIL in realistic data-scarce scenarios.
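
To make the idea of query-based fusion concrete, here is a minimal sketch in plain Python. It assumes the fusion behaves like scaled dot-product attention, with an instance feature as the query and the vectors in the library as both keys and values, followed by a residual addition. The function names `softmax` and `fuse_with_library` are hypothetical, and the fusion rule actually used in KANet may differ.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_with_library(query, library):
    """Fuse one feature vector with a library of knowledge vectors.

    Assumption: scaled dot-product attention plus a residual connection.
    This is an illustrative stand-in, not the paper's exact fusion rule.
    """
    scale = math.sqrt(len(query))
    # Similarity between the query and every knowledge vector.
    scores = [sum(q * k for q, k in zip(query, vec)) / scale for vec in library]
    weights = softmax(scores)
    # Attention-weighted sum of the knowledge vectors.
    retrieved = [
        sum(w * vec[i] for w, vec in zip(weights, library))
        for i in range(len(query))
    ]
    # Residual connection: general feature plus retrieved knowledge.
    fused = [q + r for q, r in zip(query, retrieved)]
    return fused, weights


fused, weights = fuse_with_library([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0]])
```

In this toy call the first library vector is aligned with the query, so it receives the larger attention weight and dominates the retrieved knowledge.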

Abstract

Few-shot class-incremental learning (FSCIL) aims to incrementally recognize new classes from a few samples while maintaining performance on previously learned classes. One effective approach is to construct prototypical evolution classifiers. Despite the progress of existing methods, most simply initialize the classifier weights with mean features. Because representations for new classes are weak and biased, we argue that such a strategy is suboptimal. In this paper, we tackle this issue from two aspects. First, we employ a foundation model, CLIP, as the network backbone to provide a general representation for each class. Second, to generate a more reliable and comprehensive instance representation, we propose a Knowledge Adapter (KA) module that summarizes data-specific knowledge from the training data and fuses it into the general representation. Additionally, to adapt the knowledge learned from the base classes to upcoming classes, we propose Incremental Pseudo Episode Learning (IPEL), a mechanism that simulates the actual FSCIL setting. Taken together, our proposed method, dubbed Knowledge Adaptation Network (KANet), achieves competitive performance on a wide range of datasets, including CIFAR100, CUB200, and ImageNet-R.
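
The mean-feature initialization that the abstract criticizes can be illustrated with a small sketch. It assumes features are plain lists of floats and that prediction uses cosine similarity to each class prototype, which is a common choice for prototypical classifiers. The helper names `class_prototypes`, `cosine`, and `predict` are hypothetical and are not taken from the paper.

```python
import math


def class_prototypes(features, labels):
    """Return one prototype per class: the mean of that class's features."""
    sums, counts = {}, {}
    for feat, label in zip(features, labels):
        if label not in sums:
            sums[label] = [0.0] * len(feat)
            counts[label] = 0
        sums[label] = [s + x for s, x in zip(sums[label], feat)]
        counts[label] += 1
    return {
        label: [s / counts[label] for s in total]
        for label, total in sums.items()
    }


def cosine(a, b):
    """Cosine similarity with a small epsilon to avoid division by zero."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b + 1e-12)


def predict(feature, prototypes):
    """Assign the class whose prototype is most similar to the feature."""
    return max(prototypes, key=lambda label: cosine(feature, prototypes[label]))


protos = class_prototypes(
    [[1.0, 0.0], [0.8, 0.2], [0.0, 1.0]],
    ["a", "a", "b"],
)
label = predict([1.0, 0.1], protos)
```

With only a handful of samples per new class, each prototype is an average over very few points, which is one way to see why the abstract calls new-class representations weak and biased.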

Paper Structure (23 sections, 6 equations, 6 figures, 8 tables)

This paper contains 23 sections, 6 equations, 6 figures, 8 tables.

Figures (6)

  • Figure 1: Illustration of our motivation. Our proposed method borrows from the Retrieval-Augmented Generation (RAG) technique: it summarizes data-specific knowledge into a Knowledge Vector Library (KVL) and uses that knowledge to refine the model's output.
  • Figure 2: Our proposed method adopts the pretrained image branch of CLIP as the backbone. The knowledge adapter (KA) is plugged into one of the encoding layers and uses query-based knowledge fusion (QKF) to integrate the data-specific knowledge stored in the knowledge vector library $\mathcal{M}$ with the general knowledge of CLIP, enhancing the instance representation. The incremental pseudo episode learning (IPEL) scheme simulates real-world incremental settings and trains the KA via pseudo task adaptation and balance learning, where $\mathcal{M}^{\text{pn}}$ refers to the pseudo new knowledge extracted from the support set $S$ and $\mathcal{M}^{\text{pg}}$ indicates the pseudo global knowledge constructed from $\mathcal{M}^{\text{po}}$ and $\mathcal{M}^{\text{pn}}$.
  • Figure 3: t-SNE visualization of the feature spaces obtained (a) without and (b) with the knowledge adapter (KA).
  • Figure 4: Analysis of incremental pseudo episode learning under different conditions on CIFAR100.
  • Figure 5: Visualization of attention weights between different classes and the data-specific knowledge summarized in the knowledge vector library, where 5 base classes and 5 incremental classes are selected.
  • ...and 1 more figure
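
The pseudo incremental tasks that IPEL builds from base data, including the support set $S$ mentioned in the Figure 2 caption, can be pictured as episode sampling. The sketch below assumes an N-way K-shot setup in which a few base classes are treated as pseudo new classes, K samples of each form the support set, and the remaining base samples serve as queries. The function `sample_pseudo_episode` and its parameters are hypothetical, and the paper's actual sampling scheme may differ.

```python
import random


def sample_pseudo_episode(labels, n_way, k_shot, rng):
    """Sample one pseudo incremental episode from base-session labels.

    labels: one class label per base-session sample.
    Returns (pseudo_new_classes, support_indices, query_indices).
    Assumption: a plain N-way K-shot split, used here for illustration only.
    """
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)
    # Only classes with at least k_shot samples can act as pseudo new classes.
    eligible = sorted(c for c, idxs in by_class.items() if len(idxs) >= k_shot)
    pseudo_new = rng.sample(eligible, n_way)
    support = []
    for cls in pseudo_new:
        support.extend(rng.sample(by_class[cls], k_shot))
    held_out = set(support)
    # Everything outside the support set is available as query data.
    query = [i for i in range(len(labels)) if i not in held_out]
    return pseudo_new, support, query


base_labels = [cls for cls in range(6) for _ in range(10)]
new_classes, support, query = sample_pseudo_episode(
    base_labels, n_way=2, k_shot=5, rng=random.Random(0)
)
```

Passing an explicit `random.Random` instance keeps episodes reproducible, which helps when comparing training runs under different episode settings.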