Few-Shot Class-Incremental Learning with Prior Knowledge

Wenhao Jiang, Duo Li, Menghan Hu, Guangtao Zhai, Xiaokang Yang, Xiao-Ping Zhang

TL;DR

The paper tackles catastrophic forgetting and overfitting in FSCIL by introducing Learning with Prior Knowledge (LwPK), which pre-trains a model on labeled base data together with unlabeled data from prospective classes, using deep clustering to assign pseudo-labels to the unlabeled samples. This forward-prior strategy creates a hybrid embedding space and reduces model updates during incremental phases, as supported by empirical risk minimization and class-distance analysis. Across CIFAR100, CUB200, and miniImageNet, LwPK demonstrates superior or competitive performance relative to strong FSCIL baselines and semi-supervised rivals, highlighting the value of leveraging unlabeled data as priors. The work includes extensive ablations and theoretical justification, and releases its code to facilitate replication and future exploration of forward priors in FSCIL.

Abstract

To tackle the issues of catastrophic forgetting and overfitting in few-shot class-incremental learning (FSCIL), previous work has primarily concentrated on preserving the memory of old knowledge during the incremental phase. The role of the pre-trained model in shaping the effectiveness of incremental learning is frequently underestimated in these studies. Therefore, to enhance the generalization ability of the pre-trained model, we propose Learning with Prior Knowledge (LwPK), which introduces nearly free prior knowledge from a small amount of unlabeled data from subsequent incremental classes. We cluster unlabeled incremental class samples to produce pseudo-labels, then jointly train on these with labeled base class samples, effectively allocating embedding space for both old and new class data. Experimental results indicate that LwPK effectively enhances the model's resilience against catastrophic forgetting, with theoretical analysis based on empirical risk minimization and class distance measurement corroborating its operational principles. The source code of LwPK is publicly available at: https://github.com/StevenJ308/LwPK.
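
The pseudo-labeling step described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: plain k-means on precomputed feature vectors stands in for the paper's deep clustering module, and the function names (`kmeans_pseudo_labels`, `build_joint_training_set`), the toy data, and the convention of offsetting cluster ids by the number of base classes are all hypothetical.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_pseudo_labels(features, num_clusters, num_iters=20, seed=0):
    """Assign a cluster id to every feature vector with plain k-means.

    Assumption: k-means is only a simple stand-in for deep clustering.
    """
    rng = random.Random(seed)
    # Initialise centroids from distinct, randomly chosen samples.
    centroids = [list(f) for f in rng.sample(features, num_clusters)]
    assignments = [0] * len(features)
    for _ in range(num_iters):
        # Assignment step: nearest centroid for every sample.
        assignments = [
            min(range(num_clusters),
                key=lambda k: squared_distance(f, centroids[k]))
            for f in features
        ]
        # Update step: move each centroid to the mean of its members.
        for k in range(num_clusters):
            members = [f for f, a in zip(features, assignments) if a == k]
            if members:  # keep the old centroid if a cluster is empty
                centroids[k] = [sum(col) / len(members)
                                for col in zip(*members)]
    return assignments


def build_joint_training_set(base_x, base_y, unlabeled_x,
                             num_base_classes, num_new_clusters):
    """Merge labeled base data with pseudo-labeled unlabeled data.

    Cluster ids are shifted by num_base_classes (a hypothetical
    convention) so pseudo-labels never collide with base labels.
    """
    cluster_ids = kmeans_pseudo_labels(unlabeled_x, num_new_clusters)
    pseudo_y = [num_base_classes + c for c in cluster_ids]
    return base_x + unlabeled_x, base_y + pseudo_y


# Toy usage: 3 base classes, 2 clusters of unlabeled "future" samples.
toy_rng = random.Random(1)
base_x = [[toy_rng.gauss(0.0, 1.0) for _ in range(4)] for _ in range(12)]
base_y = [i // 4 for i in range(12)]
unlabeled_x = (
    [[toy_rng.gauss(10.0, 1.0) for _ in range(4)] for _ in range(5)]
    + [[toy_rng.gauss(-10.0, 1.0) for _ in range(4)] for _ in range(5)]
)
joint_x, joint_y = build_joint_training_set(
    base_x, base_y, unlabeled_x, num_base_classes=3, num_new_clusters=2)
```

A classifier trained on `joint_x` and `joint_y` would then see output slots for the pseudo-classes during pre-training, which is the sense in which embedding space is reserved for classes that arrive later.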

Paper Structure (19 sections, 19 equations, 5 figures, 9 tables, 1 algorithm)

This paper contains 19 sections, 19 equations, 5 figures, 9 tables, 1 algorithm.

Figures (5)

  • Figure 1: Illustration of the operations we take to build a model with prior knowledge: a) the training data consists of labeled base class data and unlabeled incremental class data, and we provide supervisory signals for the unlabeled data using unsupervised methods. b) the embedding space occupied by existing class data and reserved for new class data, thereby demonstrating the influence of the incremental class data on the construction of a pre-trained model.
  • Figure 2: Overall Framework of LwPK. a) Pipeline of LwPK, shown for scenarios where the volume of labeled data surpasses that of unlabeled data. b) Detailed description of the clustering module. $p_i$ denotes the $i$-th image in the unlabeled dataset, $v_i$ denotes the $i$-th dimension of the feature. $\mathcal{L}_1$ and $\mathcal{L}_2$ are the two loss functions used for representation learning.
  • Figure 3: Test accuracy of each session on three datasets.
  • Figure 4: t-SNE visualization plot, 5 base classes & 5 new classes on CUB200.
  • Figure 5: Impact of the reconciliation coefficient $\omega$. "w/" and "w/o" denote with and without $\omega$, respectively.