Controllable Forgetting Mechanism for Few-Shot Class-Incremental Learning

Kirill Paramonov, Mete Ozay, Eunju Yang, Jijoong Moon, Umberto Michieli

TL;DR

The paper tackles catastrophic forgetting in One-Shot Class-Incremental Learning (OSCIL), the one-shot case of Few-Shot Class-Incremental Learning (FSCIL), for on-device personalization. It introduces Novel Class Detection (NCD), a threshold-based inference rule that partitions decisions between base and novel prototypes and makes forgetting controllable: a distance threshold $\alpha$ is selected to meet a forgetting budget. In experiments across backbones and datasets, the authors show that NCD yields substantial gains in novel-class accuracy (NCR) while keeping base-class forgetting within predefined limits, with notable improvements in ultra-low-shot regimes (e.g., 1-shot, 1 novel class). The work highlights the practical value of predictable quality of service (QoS) on device, and demonstrates that the approach is compatible with various base-training strategies and can even support out-of-distribution (OOD) detection as a by-product. Overall, the proposed NCD framework is a simple, effective, and adaptable way to balance adaptation to novel classes against retention of base knowledge in OSCIL.
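
A minimal sketch of a threshold rule of this kind may help make the idea concrete. It assumes Euclidean distances to class prototypes, and it assumes that a sample is routed to the novel branch when its nearest novel prototype is closer than $\alpha$. The function names `ncd_predict` and `calibrate_alpha`, and the calibration on held-out base samples, are illustrative assumptions rather than the authors' implementation.

```python
import numpy as np


def ncd_predict(features, base_protos, novel_protos, alpha):
    """Route each sample to the novel or base branch with a distance threshold.

    Assumption: a sample goes to the novel branch when its distance to the
    nearest novel prototype is below alpha; otherwise the base branch decides.
    Returned labels index base classes first, then novel classes.
    """
    # Pairwise Euclidean distances, shapes (n, n_base) and (n, n_novel).
    d_base = np.linalg.norm(features[:, None, :] - base_protos[None, :, :], axis=-1)
    d_novel = np.linalg.norm(features[:, None, :] - novel_protos[None, :, :], axis=-1)
    is_novel = d_novel.min(axis=1) < alpha
    base_pred = d_base.argmin(axis=1)
    novel_pred = len(base_protos) + d_novel.argmin(axis=1)
    return np.where(is_novel, novel_pred, base_pred)


def calibrate_alpha(base_val_features, novel_protos, forgetting_budget):
    """Choose alpha so that at most `forgetting_budget` of held-out base
    samples fall below it (assumed proxy for base-class forgetting)."""
    d_nearest = np.linalg.norm(
        base_val_features[:, None, :] - novel_protos[None, :, :], axis=-1
    ).min(axis=1)
    d_sorted = np.sort(d_nearest)
    # At most k samples are strictly closer than d_sorted[k].
    k = min(int(np.floor(forgetting_budget * len(d_sorted))), len(d_sorted) - 1)
    return d_sorted[k]
```

Under these assumptions, the share of held-out base samples that the rule sends to the novel branch never exceeds the chosen budget, which is one way the amount of forgetting could be fixed before deployment.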

Abstract

Class-incremental learning in the context of limited personal labeled samples (few-shot) is critical for numerous real-world applications, such as smart home devices. A key challenge in these scenarios is balancing the trade-off between adapting to new, personalized classes and maintaining the performance of the model on the original, base classes. Fine-tuning the model on novel classes often leads to the phenomenon of catastrophic forgetting, where the accuracy of base classes declines unpredictably and significantly. In this paper, we propose a simple yet effective mechanism to address this challenge by controlling the trade-off between novel and base class accuracy. We specifically target the ultra-low-shot scenario, where only a single example is available per novel class. Our approach introduces a Novel Class Detection (NCD) rule, which adjusts the degree of forgetting a priori while simultaneously enhancing performance on novel classes. We demonstrate the versatility of our solution by applying it to state-of-the-art Few-Shot Class-Incremental Learning (FSCIL) methods, showing consistent improvements across different settings. To better quantify the trade-off between novel and base class performance, we introduce new metrics: NCR@2FOR and NCR@5FOR. Our approach achieves up to a 30% improvement in novel class accuracy on the CIFAR100 dataset (1-shot, 1 novel class) while maintaining a controlled base class forgetting rate of 2%.
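
One way to read NCR@2FOR and NCR@5FOR is as the best novel-class accuracy reachable while base-class forgetting stays within 2% or 5%. The sketch below follows that reading. The definition of forgetting as a relative drop in base accuracy, the threshold sweep, and the function name `ncr_at_for` are assumptions made for illustration, not the paper's definitions.

```python
import numpy as np


def ncr_at_for(d_base, d_novel, labels, n_base, base_acc_before, budget, alphas):
    """Best novel-class accuracy over thresholds that respect a forgetting budget.

    d_base:  (n, n_base) distances from each test sample to base prototypes.
    d_novel: (n, n_novel) distances to novel prototypes.
    labels:  true labels, base classes first, then novel classes.
    """
    is_base = labels < n_base
    base_pred = d_base.argmin(axis=1)
    novel_pred = n_base + d_novel.argmin(axis=1)
    nearest_novel = d_novel.min(axis=1)

    best = 0.0
    for alpha in alphas:
        # Assumed routing: novel branch if the nearest novel prototype is within alpha.
        pred = np.where(nearest_novel < alpha, novel_pred, base_pred)
        correct = pred == labels
        base_acc = correct[is_base].mean()
        ncr = correct[~is_base].mean()
        # Assumed definition: relative drop in base accuracy after the increment.
        forgetting = (base_acc_before - base_acc) / base_acc_before
        if forgetting <= budget:
            best = max(best, ncr)
    return best
```

Calling it with `budget=0.02` or `budget=0.05` would give the two reported operating points under this reading.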

Paper Structure

This paper contains 13 sections, 4 equations, 3 figures, 1 table.

Figures (3)

  • Figure 1: Setup for FSCIL with $K$ shots. A base training session is usually done on the server and multiple incremental training sessions are usually done on device with a few annotated samples (i.e., the support set) from novel classes. In our paper, we focus on the one-shot case ($K=1$).
  • Figure 2: Overview of our method. Left: base training session (Sec. \ref{sec:base_train}), e.g., based on ProtoNet \cite{snell2017prototypical}, SAVC \cite{song2023learning}, FACT \cite{zhou2022forward}, OrCo \cite{ahmed2024OrCo}. Middle: incremental training session (Sec. \ref{sec:incremental_train}) with frozen backbone. Right: inference stage (Sec. \ref{sec:inference}), where our NCD Decision Rule controls the inference logic flow between branches of base and novel classes.
  • Figure 3: Comparison between vanilla and NCD-based inference methods. Left: NCR for an increasing number of novel classes $N_1$ with $K=1$. Right: NCR for an increasing number of shots $K$ with $N_1=5$. Experiments use a ResNet18-PN backbone on the CUB200 dataset.
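
With a frozen backbone, an incremental session can reduce to computing one prototype per novel class from the $K$-shot support set. The sketch below assumes ProtoNet-style mean prototypes over embeddings that were already produced by the backbone. `class_prototypes` is a hypothetical helper, and the paper's incremental step may differ.

```python
import numpy as np


def class_prototypes(support_embeddings, support_labels):
    """Average the support embeddings of each class into one prototype.

    support_embeddings: (n_support, dim) array from a frozen backbone (assumed given).
    support_labels:     (n_support,) integer class ids.
    Returns the sorted class ids and a (n_classes, dim) prototype array.
    """
    classes = np.unique(support_labels)
    protos = np.stack(
        [support_embeddings[support_labels == c].mean(axis=0) for c in classes]
    )
    return classes, protos
```

In the one-shot case ($K=1$) each prototype is simply the embedding of the single support sample.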