A Bag of Tricks for Few-Shot Class-Incremental Learning

Shuvendu Roy, Chunjong Park, Aldi Fahrezi, Ali Etemad

TL;DR

This work addresses FSCIL, a challenging continual learning setting with limited samples for new classes, by proposing a bag-of-tricks framework that unifies six techniques across stability, adaptability, and training. The approach combines SupCon-based stability, ETF prototype pre-assignment, pseudo-classes, Incremental SubNet Tuning, self-supervised pre-training, and a rotation pretext task within a baseline incremental-frozen structure. Empirically, it achieves state-of-the-art results on CIFAR-100, CUB-200, and miniImageNet, with notable improvements in stability (reduced forgetting) and adaptability (better novel-class learning), and demonstrates scalability to larger encoders and ImageNet-1K. The work provides a practical baseline and a comprehensive analysis of how these tricks interact to balance the stability–adaptability trade-off in FSCIL, with broader implications for data-scarce continual learning scenarios.

Abstract

We present a bag of tricks framework for few-shot class-incremental learning (FSCIL), which is a challenging form of continual learning that involves continuous adaptation to new tasks with limited samples. FSCIL requires both stability and adaptability, i.e., preserving proficiency in previously learned tasks while learning new ones. Our proposed bag of tricks brings together six key and highly influential techniques that improve stability, adaptability, and overall performance under a unified framework for FSCIL. We organize these tricks into three categories: stability tricks, adaptability tricks, and training tricks. Stability tricks aim to mitigate the forgetting of previously learned classes by enhancing the separation between the embeddings of learned classes and minimizing interference when learning new ones. On the other hand, adaptability tricks focus on the effective learning of new classes. Finally, training tricks improve the overall performance without compromising stability or adaptability. We perform extensive experiments on three benchmark datasets, CIFAR-100, CUB-200, and miniImageNet, to evaluate the impact of our proposed framework. Our detailed analysis shows that our approach substantially improves both stability and adaptability, establishing a new state-of-the-art by outperforming prior works in the area. We believe our method provides a go-to solution and establishes a robust baseline for future research in this area.
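
As a rough illustration of the ETF prototype pre-assignment named in the TL;DR, the sketch below builds a generic simplex equiangular tight frame (ETF): a set of unit-norm class prototypes whose pairwise cosine similarity is the same for every pair, namely -1/(K-1) for K classes. This is the standard textbook construction, not the paper's own code. The function name `simplex_etf`, the random orthonormal basis, and the requirement `dim >= num_classes` are assumptions of this sketch, and how the paper assigns classes to prototypes is not described in this summary.

```python
import numpy as np

def simplex_etf(num_classes, dim, seed=0):
    """Return a (num_classes, dim) matrix of unit-norm simplex ETF prototypes.

    Hypothetical helper for illustration only.
    """
    if num_classes < 2:
        raise ValueError("need at least two classes")
    if dim < num_classes:
        raise ValueError("this construction needs dim >= num_classes")
    rng = np.random.default_rng(seed)
    # Orthonormal columns U (dim x K) from the reduced QR of a random matrix.
    u, _ = np.linalg.qr(rng.standard_normal((dim, num_classes)))
    k = num_classes
    # Centering matrix I - (1/K) * 1 1^T removes the mean direction.
    centering = np.eye(k) - np.ones((k, k)) / k
    # Scale so that every prototype has unit norm.
    etf = np.sqrt(k / (k - 1)) * (u @ centering)
    return etf.T  # one prototype per row

prototypes = simplex_etf(num_classes=5, dim=16)
gram = prototypes @ prototypes.T  # cosine similarities, since rows are unit-norm
```

Because all prototypes are fixed before training and equally far apart, slots can in principle be reserved for classes that have not been seen yet, which is the intuition usually given for pre-assignment.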

Paper Structure (27 sections, 9 equations, 8 figures, 14 tables)

Figures (8)

  • Figure 1: The intuition behind stability tricks. Better separation of base classes ensures stability in incremental learning.
  • Figure 2: Properties of stability tricks on CIFAR-100. (a) Presents inter-class distance (the distance between class prototypes), which we aim to maximize for better stability during incremental training; (b) depicts the intra-class distance (the average distance of samples from the corresponding prototypes), which we aim to minimize for better stability; (c) presents the class separation degree (the overall separability of classes ranging between 0 and 1), which we aim to maximize; (d) presents the accuracy of Base, Novel, and Total classes.
  • Figure 3: Properties of adaptability tricks on CIFAR-100. (a) Presents accuracy of Base, Novel, and Total classes; (b) presents the total accuracies after each session, which we aim to maximize; (c) and (d) depict t-SNE visualizations for stability and adaptability tricks, where incorporating adaptability tricks shows more separation. Here, 0-4 are base classes, and 5-6 are novel classes.
  • Figure 4: Confusion matrices for the baseline and our bag of tricks. The baseline performs well on the base session, but performance drops for novel classes. Our framework shows improved performance for both base and novel classes.
  • Figure 5: Comparison to prior works across CIFAR-100, CUB-200, and miniImageNet datasets, demonstrating that our solution outperforms prior works.
  • ...and 3 more figures
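
The two distances defined in the Figure 2 caption can be sketched as follows. This is a minimal illustration, not the paper's evaluation code: it assumes a class prototype is the mean embedding of that class, uses Euclidean distance, and averages the intra-class distance over all samples. The function name `prototype_distances` is hypothetical. The class separation degree from panel (c) is not computed here, because its formula is not given in this summary.

```python
import numpy as np

def prototype_distances(embeddings, labels):
    """Return (inter_class, intra_class) distances for labelled embeddings.

    Needs at least two distinct classes in `labels`.
    """
    embeddings = np.asarray(embeddings, dtype=float)
    labels = np.asarray(labels)
    classes = np.unique(labels)
    # Assumption: a class prototype is the mean embedding of that class.
    prototypes = np.stack([embeddings[labels == c].mean(axis=0) for c in classes])
    # Inter-class: mean Euclidean distance over all distinct prototype pairs.
    diffs = prototypes[:, None, :] - prototypes[None, :, :]
    pairwise = np.linalg.norm(diffs, axis=-1)
    upper = np.triu_indices(len(classes), k=1)
    inter = pairwise[upper].mean()
    # Intra-class: mean distance of each sample to its own class prototype.
    index = np.searchsorted(classes, labels)
    intra = np.linalg.norm(embeddings - prototypes[index], axis=1).mean()
    return float(inter), float(intra)

# Toy example: two well-separated classes with two samples each.
points = np.array([[0.0, 0.0], [0.0, 2.0], [4.0, 0.0], [4.0, 2.0]])
targets = np.array([0, 0, 1, 1])
inter, intra = prototype_distances(points, targets)
```

Under the caption's reading, a larger `inter` together with a smaller `intra` corresponds to better-separated base classes.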