An experimental approach on Few Shot Class Incremental Learning

Marinela Adam

TL;DR

This work investigates Few-Shot Class-Incremental Learning (FSCIL) and the persistent challenge of catastrophic forgetting. It proposes enhancing the Learning Prompt with Distribution-based Feature Replay (LP-DiF) by substituting the vision-language model CLIP with CLOOB, aiming to improve zero-shot and few-shot retention while leveraging a Variational Autoencoder–based pseudo-feature replay mechanism. Through systematic experiments on CIFAR-100, mini-ImageNet, and CUB-200, the study demonstrates that CLOOB can outperform CLIP with larger encoders and, when integrated into LP-DiF, yields competitive or superior results relative to current SOTA FSCIL methods such as SV-T, LIMIT, and S2C. The findings suggest that CLOOB-powered LP-DiF enhances robustness to distribution shifts and improves knowledge retention across incremental tasks, with practical implications for scalable, continual learning in dynamic environments. Limitations include restricted backbone choices and reliance on pretraining datasets, motivating future work on broader backbones, task-free settings, and cross-domain extensions.
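
The idea of replaying pseudo-features drawn from stored class distributions can be illustrated with a minimal sketch. It assumes a diagonal Gaussian per old class and random stand-in features; it is a simplification for illustration, not the VAE-based mechanism of LP-DiF itself, and the function names `fit_class_gaussians` and `sample_pseudo_features` are hypothetical.

```python
import numpy as np

def fit_class_gaussians(features, labels):
    """Estimate a diagonal Gaussian (mean, std) per class from stored features."""
    stats = {}
    for c in np.unique(labels):
        class_feats = features[labels == c]
        # Small constant keeps the std strictly positive (assumption for stability).
        stats[int(c)] = (class_feats.mean(axis=0), class_feats.std(axis=0) + 1e-6)
    return stats

def sample_pseudo_features(stats, n_per_class, rng):
    """Draw pseudo-features for every old class from its stored Gaussian."""
    feats, labels = [], []
    for c, (mean, std) in stats.items():
        feats.append(rng.normal(mean, std, size=(n_per_class, mean.shape[0])))
        labels.append(np.full(n_per_class, c))
    return np.concatenate(feats), np.concatenate(labels)

rng = np.random.default_rng(0)
# Hypothetical old-session features: 2 classes, 50 samples each, 8 dimensions.
old_feats = np.concatenate([rng.normal(0.0, 1.0, (50, 8)),
                            rng.normal(3.0, 1.0, (50, 8))])
old_labels = np.repeat([0, 1], 50)

# Only the per-class statistics need to be kept after the session ends.
stats = fit_class_gaussians(old_feats, old_labels)
replay_feats, replay_labels = sample_pseudo_features(stats, n_per_class=20, rng=rng)
```

In a later session, `replay_feats` would be mixed with the few real features of the new classes, so the classifier keeps seeing old classes without the original images being stored.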

Abstract

Few-Shot Class-Incremental Learning (FSCIL) is a machine learning paradigm in which a model learns new classes from only a few examples while retaining the knowledge it has already acquired. This paper reviews several existing solutions, each supported by extensive experiments across large-scale datasets, domain shifts, and network architectures, and compares the selected methods. We highlight their advantages and then present an experimental approach that aims to improve the most promising one by replacing its vision-language (V-L) model, CLIP, with another V-L model, CLOOB, which seems to outperform it on zero-shot learning tasks. We also give an overview and analysis of recent advances in the FSCIL domain, focusing on strategies that mitigate catastrophic forgetting and improve the adaptability of models to evolving tasks and datasets.

Paper Structure

This paper contains 40 sections, 1 equation, 4 figures, 2 tables.

Figures (4)

  • Figure 1: Accuracy curves of LP-DiF and comparison with counterparts on the SUN-397 and CUB200* datasets
  • Figure 2: Illustration of LIMIT. Left: We sample fake-incremental tasks from the base training set D0, forming various fake-tasks. Right: Meta-calibration process. The model needs to calibrate between old class classifiers and new class prototypes with a set-to-set function. We also input the query instance embedding into the meta-calibration module to contextualize the classification task, generating instance-specific embeddings. The final logit is calculated by the inner product of the transformed classifier and the transformed query embedding.
  • Figure 3: Illustration of our proposed S2C for FSCIL. Top: the setting of FSCIL. Bottom: Sample-level to Class-level graphs.
  • Figure 4: FSCIL approaches comparison
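
The meta-calibration step described in the Figure 2 caption can be sketched as follows. The caption only states that a set-to-set function transforms the classifiers, prototypes, and query embedding; the single-head self-attention with a residual connection used here, the weight matrices, and all function names are assumptions made for illustration, not LIMIT's actual implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def set_to_set(tokens, w_q, w_k, w_v):
    """Assumed set-to-set function: single-head self-attention plus residual."""
    q, k, v = tokens @ w_q, tokens @ w_k, tokens @ w_v
    attn = softmax(q @ k.T / np.sqrt(k.shape[-1]))
    return tokens + attn @ v

def calibrated_logits(old_classifiers, new_prototypes, query, w_q, w_k, w_v):
    # Old-class classifiers, new-class prototypes, and the query form one set,
    # so every output vector is conditioned on the query instance.
    tokens = np.vstack([old_classifiers, new_prototypes, query[None, :]])
    out = set_to_set(tokens, w_q, w_k, w_v)
    classifiers, query_t = out[:-1], out[-1]
    # Final logit: inner product of transformed classifier and transformed query.
    return classifiers @ query_t

rng = np.random.default_rng(0)
d = 16                                   # hypothetical embedding size
old_cls = rng.normal(size=(5, d))        # 5 old-class classifiers
new_proto = rng.normal(size=(3, d))      # 3 new-class prototypes
query = rng.normal(size=d)               # one query embedding
w_q, w_k, w_v = (rng.normal(scale=0.1, size=(d, d)) for _ in range(3))

logits = calibrated_logits(old_cls, new_proto, query, w_q, w_k, w_v)
predicted_class = int(np.argmax(logits))
```

With `w_v` set to zeros the attention contributes nothing and the logits reduce to the plain inner product between the untransformed classifiers and the query, which is a convenient sanity check for the sketch.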