Rethinking Few-shot Class-incremental Learning: Learning from Yourself

Yu-Ming Tang, Yi-Xing Peng, Jingke Meng, Wei-Shi Zheng

TL;DR

This work addresses the evaluation bias in FSCIL by introducing generalized average accuracy ($gAcc$), a parameterized metric that balances base and novel-class performance and is summarized via the area under its curve (AUC) across $\alpha$. It also proposes a ViT-based framework with a lightweight Feature Rectification (FR) module that leverages intermediate-layer representations through two relation-transfer losses (IR and CR) and multi-layer knowledge ensembles to improve novel-class generalization. The approach yields strong results across three FSCIL benchmarks (CIFAR-100, miniImageNet, CUB-200), with notable gains in $gAcc$ while maintaining competitive $aAcc$, and is supported by extensive ablations and corner-case analyses. The work provides a practical framework and evaluation toolkit for balancing base-novel performance in continual learning, with publicly available code to foster reproducibility and comparability.

Abstract

Few-shot class-incremental learning (FSCIL) aims to learn a sequence of new classes from only a few samples each. Inherited from the classical class-incremental learning setting, the popular FSCIL benchmark uses averaged accuracy (aAcc) and last-task averaged accuracy (lAcc) as its evaluation metrics. However, we reveal that these metrics may not place adequate emphasis on novel-class performance, so the continual learning ability of FSCIL methods can be overlooked under this benchmark. In this work, as a complement to existing metrics, we offer a new metric called generalized average accuracy (gAcc), which is designed to provide a more equitable evaluation by incorporating different perspectives on performance under the guidance of a parameter $\alpha$. We also present an overall metric in the form of the area under the curve (AUC) over $\alpha$. Under the guidance of gAcc, we unlock the potential of the intermediate features of vision transformers to boost novel-class performance. By taking information from intermediate layers, which are less class-specific and more generalizable, we rectify the final features, leading to a more generalizable transformer-based FSCIL framework. Without complex network designs or cumbersome training procedures, our method outperforms existing FSCIL methods in both aAcc and gAcc on three datasets. Code is available at https://github.com/iSEE-Laboratory/Revisting_FSCIL
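To make the idea of an $\alpha$-parameterized accuracy and its AUC concrete, the following is a minimal sketch, not the paper's definition. The abstract does not state the formula for gAcc, so the weighting used here is an assumption chosen only for illustration: base classes are down-weighted by $\alpha$, so that $\alpha = 1$ recovers the usual class-count-weighted accuracy and $\alpha = 0$ considers novel classes only. The function names `g_acc` and `auc_over_alpha` and the accuracy values in the usage lines are hypothetical.

```python
def g_acc(base_acc: float, novel_acc: float,
          n_base: int, n_novel: int, alpha: float) -> float:
    """Illustrative alpha-weighted accuracy (assumed form, not the paper's formula).

    The base-class contribution is scaled by alpha in [0, 1]:
      alpha = 1 -> ordinary class-count-weighted accuracy
      alpha = 0 -> novel-class accuracy only
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    w_base = alpha * n_base
    w_novel = float(n_novel)
    total = w_base + w_novel
    if total == 0:
        raise ValueError("no classes carry weight for this alpha")
    return (w_base * base_acc + w_novel * novel_acc) / total


def auc_over_alpha(base_acc: float, novel_acc: float,
                   n_base: int, n_novel: int, steps: int = 100) -> float:
    """Area under the g_acc curve for alpha in [0, 1], via the trapezoidal rule."""
    alphas = [i / steps for i in range(steps + 1)]
    values = [g_acc(base_acc, novel_acc, n_base, n_novel, a) for a in alphas]
    return sum(
        (values[i] + values[i + 1]) / 2 * (alphas[i + 1] - alphas[i])
        for i in range(steps)
    )


if __name__ == "__main__":
    # Hypothetical numbers: 60 base classes at 80% and 40 novel classes at 20%.
    for a in (0.0, 0.5, 1.0):
        print(a, g_acc(0.8, 0.2, 60, 40, a))
    print("AUC:", auc_over_alpha(0.8, 0.2, 60, 40))
```

Under this assumed form, a model that keeps high base-class accuracy but learns little from novel classes scores well only near $\alpha = 1$, and the AUC summarizes the whole curve in a single number.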

Paper Structure (25 sections, 14 equations, 12 figures, 14 tables)

Figures (12)

  • Figure 1: Performance of two recent methods [subnetwork, s3c] under different metrics. (a): The dotted line denotes the average accuracy ($aAcc$) widely used in classical class-incremental learning. The solid line presents our proposed generalized accuracy ($gAcc$). We also show the average over all points on the right side (Blue: $gAcc$, Red: $aAcc$). (b): The detailed accuracies on each task after training on the last task (task 8). S3C clearly performs better on novel classes, but the conventional metric $aAcc$ fails to reflect this because it is dominated by base-class performance. Models are trained on the CIFAR-100 dataset. Best viewed in color.
  • Figure 2: FSCIL performance measured by generalized accuracy. (a): The $gAcc$ curve (averaged across all $n_t$ tasks) vs. the parameter $\alpha$. (b): The $gAcc$ AUC of each task $\mathcal{T}_i$ (\ref{eq:gacci}). In the legend, we show the average AUC across tasks (\ref{eq:gacc}) for each method. We evaluate recent works including SAVC [savc], NC [neural_collapse], and Alice [alice] on the miniImageNet dataset. Some corner cases: Lazy: The model maintains base-task performance while refusing to learn anything from novel tasks. Greedy: Regardless of previous knowledge, the model greedily focuses only on the current task. Greedy-NF: A Non-Forget version of 'Greedy'. See more about these cases in our supplementary material.
  • Figure 3: t-SNE [tsne] visualization of miniImageNet test-set features from different layers of a ViT. Left: We choose layers $L_9$, $L_{10}$, $L_{11}$, and $L_{12}$ (final) and show features of both base and novel classes. Shallow features are more dispersed than deep features. Right: Accuracies on different tasks using features from various layers. Intermediate features achieve better performance on novel classes and thus generalize better. Best viewed in color.
  • Figure 4: The overall framework of our method. The Feature Rectification (FR) module takes the final and the intermediate features as input and outputs the rectified features for classification. The relation transfer losses are designed to convey valuable information from intermediate features to the rectified ones. The details of these two parts are shown in the left and lower parts of the figure. Further, a cosine constraint is proposed to maintain the base class performance and a classification loss is employed to adapt to the novel classes. Best viewed in color.
  • Figure 4: Ablation studies on CIFAR-100.
  • ...and 7 more figures