Unlocking Prototype Potential: An Efficient Tuning Framework for Few-Shot Class-Incremental Learning

Shengqin Jiang, Xiaoran Feng, Yuankai Qi, Haokui Zhang, Renlong Hang, Qingshan Liu, Lina Yao, Quan Z. Sheng, Ming-Hsuan Yang

TL;DR

The paper tackles few-shot class-incremental learning (FSCIL) by shifting focus from backbone fine-tuning to prototype calibration within a strong, frozen feature space. It introduces a dual-offset prototype refinement (class-specific and task-aware) plus a negative error projector (NEP) that models inter-prototype relationships, enabling robust incremental learning with very few learnable parameters. Empirical results on multiple benchmarks show state-of-the-art performance, especially with large self-supervised backbones such as DINOv3, and ablations confirm the contribution of both offsets and the NEP. The approach is practical for data-scarce continual learning scenarios where retaining previously learned knowledge is crucial.

Abstract

Few-shot class-incremental learning (FSCIL) seeks to continuously learn new classes from very limited samples while preserving previously acquired knowledge. Traditional methods often utilize a frozen pre-trained feature extractor to generate static class prototypes, which suffer from the inherent representation bias of the backbone. While recent prompt-based tuning methods attempt to adapt the backbone via minimal parameter updates, given the constraint of extreme data scarcity, the model's capacity to assimilate novel information and substantively enhance its global discriminative power is inherently limited. In this paper, we propose a novel shift in perspective: freezing the feature extractor while fine-tuning the prototypes. We argue that the primary challenge in FSCIL is not feature acquisition, but rather the optimization of decision regions within a static, high-quality feature space. To this end, we introduce an efficient prototype fine-tuning framework that evolves static centroids into dynamic, learnable components. The framework employs a dual-calibration method consisting of class-specific and task-aware offsets. These components function synergistically to improve the discriminative capacity of prototypes for ongoing incremental classes. Extensive results demonstrate that our method attains superior performance across multiple benchmarks while requiring minimal learnable parameters.
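The prototype decomposition described in the abstract can be illustrated with a minimal sketch. It assumes the two offsets are simply added to a mean-feature base prototype, that the task-aware offset is shared by all classes in a task, and that classification uses plain Euclidean distance rather than the paper's negative error projector. The function names and the toy 2-D features are hypothetical, and the offsets are fixed by hand here, whereas the framework would learn them.

```python
from typing import Dict, List, Sequence

Vector = List[float]


def mean_prototype(features: Sequence[Vector]) -> Vector:
    """Base prototype: element-wise mean of one class's frozen-backbone features."""
    n = len(features)
    return [sum(column) / n for column in zip(*features)]


def calibrated_prototype(base: Vector, class_offset: Vector, task_offset: Vector) -> Vector:
    """Assumed additive calibration: base + class-specific offset + task-aware offset."""
    return [b + c + t for b, c, t in zip(base, class_offset, task_offset)]


def squared_distance(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def predict(query: Vector, prototypes: Dict[str, Vector]) -> str:
    """Nearest-prototype rule with Euclidean distance (a stand-in, not the paper's NEP)."""
    return min(prototypes, key=lambda label: squared_distance(query, prototypes[label]))


# Toy 2-D features standing in for frozen-backbone outputs (made-up values).
support = {
    "cat": [[1.0, 0.0], [3.0, 0.0]],
    "dog": [[0.0, 2.0], [0.0, 4.0]],
}
# Offsets would be learnable parameters; they are fixed here for illustration.
task_offset = [0.0, 0.0]  # assumed to be shared by every class in the task
class_offsets = {"cat": [0.5, 0.0], "dog": [0.0, 0.0]}

prototypes = {
    label: calibrated_prototype(mean_prototype(feats), class_offsets[label], task_offset)
    for label, feats in support.items()
}
```

Because only the offsets would be trained, the number of learnable values in this sketch grows with the number of classes and tasks times the feature dimension, independent of backbone size.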

Paper Structure

This paper contains 15 sections, 7 equations, 6 figures, 2 tables.

Figures (6)

  • Figure 1: Performance comparison with TEEN [wang2023few], ASP [liu2024few], and SEC [liu2025sec] on CUB-200. “Base” and “Average” indicate the accuracy of the initial stage and the mean accuracy across all incremental stages, respectively. Base Model (B.M.) denotes the model trained exclusively on base-stage data, whereas Full Model (F.M.) represents the model trained on the incremental data across all stages. For average (B.M.), the evaluation in incremental stages is conducted using prototypes generated for novel classes based on the frozen base model. Even with extremely scarce samples in the incremental phase, our model still significantly improves performance after base class learning compared to other methods.
  • Figure 2: Overview of the proposed framework for few-shot class-incremental learning. We propose a prototype fine-tuning framework built upon a frozen pre-trained backbone. Within this architecture, each prototype is decomposed into two constituent elements for each class: a base prototype and a learnable offset. The base prototype is initialized via the global average of features extracted from the training dataset. The offset is further subdivided into a category-specific offset (sub-Sec. CSO) and a task-aware offset (sub-Sec. TAO), both of which are fine-tuned to calibrate the prototype and enhance its representational fidelity. Finally, query features are projected into the calibrated prototype space via a negative error projector (sub-Sec. NEP) for final prediction.
  • Figure 3: Performance comparison against SOTA methods on ImageNet-R, VTAB, and ImageNet-A.
  • Figure 4: Performance comparison between our proposed method and the baseline across various pre-trained backbones. Last denotes accuracy on the final task, and Avg denotes average accuracy across all tasks.
  • Figure 5: Performance comparison of different distance metrics for prototype-based classification. ED and NEP denote Euclidean distance and negative error projector, respectively. The plot shows the accuracy for the final task (Last Acc) and the average accuracy (Avg Acc) across all tasks.
  • ...and 1 more figure
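The summary metrics named in the captions above (“Base”, “Last”, and “Avg”) can be computed from a list of per-session accuracies as in the following sketch. The helper name and the accuracy values are hypothetical and are not results from the paper. The sketch also assumes the average includes the base session, which the captions do not state explicitly.

```python
from typing import Dict, Sequence


def summarize_sessions(session_acc: Sequence[float]) -> Dict[str, float]:
    """Summarize per-session accuracies, ordered from base session to final session."""
    if not session_acc:
        raise ValueError("need at least one session accuracy")
    return {
        "base": session_acc[0],                      # accuracy after the base session
        "last": session_acc[-1],                     # accuracy after the final session
        "avg": sum(session_acc) / len(session_acc),  # mean over all sessions (assumed to include base)
    }


# Made-up accuracies (in percent) for a three-session run.
summary = summarize_sessions([80.0, 70.0, 60.0])
```

Reporting “Last” alongside “Avg” separates how much accuracy remains at the end of the sequence from how the model behaves on average along the way.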