CALA: A Class-Aware Logit Adapter for Few-Shot Class-Incremental Learning

Chengyan Liu, Linglan Zhao, Fan Lyu, Kaile Du, Fuyuan Hu, Tao Zhou

TL;DR

This work tackles Few-Shot Class-Incremental Learning (FSCIL), where the backbone is trained on base data and then frozen, which biases predictions toward base classes and causes novel instances to be confused with them. It introduces the Class-Aware Logit Adapter (CALA), a lightweight classifier-level module learned through pseudo-incremental training. CALA outputs per-class balancing factors $\boldsymbol{\beta}$ from base–novel similarities and applies them to the final logits via $\hat{\mathbf{z}}=\mathbf{z}+[\mathbf{0}, \gamma\boldsymbol{\beta}]$, enabling dynamic, class-aware calibration. The balancing factors are produced by an MLP $g_\phi(\cdot)$ fed with a cosine-based similarity matrix $\tilde{\mathbf{S}}_c$ between fake novel prototypes and base prototypes, and CALA is designed as a plug-and-play module that preserves the feature space. Empirical results on mini-ImageNet, CIFAR-100, and CUB-200 show that CALA consistently outperforms state-of-the-art methods, reduces novel-to-base misclassification, and generalizes across different FSCIL baselines, all without retraining the backbone.
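The logit adjustment summarized in the TL;DR can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the two-layer MLP, its random weights, the hidden width, and all function names are hypothetical stand-ins for $g_\phi(\cdot)$, and the sketch feeds raw cosine similarities to the MLP, so any extra processing implied by the tilde in $\tilde{\mathbf{S}}_c$ is not reproduced. What it does show is the shape of the computation: one balancing factor per novel class, added only to the novel-class logits.

```python
import numpy as np

def cosine_similarity(a, b, eps=1e-8):
    """Pairwise cosine similarity between rows of a (n, d) and b (m, d) -> (n, m)."""
    a_n = a / (np.linalg.norm(a, axis=1, keepdims=True) + eps)
    b_n = b / (np.linalg.norm(b, axis=1, keepdims=True) + eps)
    return a_n @ b_n.T

def mlp_balancing_factors(sim, w1, b1, w2, b2):
    """Hypothetical two-layer MLP standing in for g_phi.

    Maps each novel class's similarity row (n_base,) to one scalar factor.
    """
    hidden = np.maximum(sim @ w1 + b1, 0.0)   # ReLU
    return (hidden @ w2 + b2).squeeze(-1)     # shape (n_novel,)

def adjust_logits(z, beta, gamma, n_base):
    """Apply z_hat = z + [0, gamma * beta]; base-class logits stay unchanged."""
    z_hat = z.copy()
    z_hat[:, n_base:] += gamma * beta
    return z_hat

# Toy data (assumed sizes, random values; illustrative only).
rng = np.random.default_rng(0)
n_base, n_novel, dim, hidden_dim = 5, 3, 16, 8
base_protos = rng.normal(size=(n_base, dim))
novel_protos = rng.normal(size=(n_novel, dim))

sim = cosine_similarity(novel_protos, base_protos)        # (n_novel, n_base)
w1, b1 = rng.normal(size=(n_base, hidden_dim)), np.zeros(hidden_dim)
w2, b2 = rng.normal(size=(hidden_dim, 1)), np.zeros(1)
beta = mlp_balancing_factors(sim, w1, b1, w2, b2)         # (n_novel,)

gamma = 1.0                                               # assumed scale
z = rng.normal(size=(4, n_base + n_novel))                # logits for 4 samples
z_hat = adjust_logits(z, beta, gamma, n_base)
```

Because the adjustment touches only the logits, the backbone features and prototypes are left as they are, which is what lets a module of this kind sit on top of an existing frozen-backbone pipeline.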

Abstract

Few-Shot Class-Incremental Learning (FSCIL) defines a practical but challenging task in which models must continually learn novel concepts from only a few training samples. Because data are scarce, existing FSCIL methods train a backbone on abundant base data and then keep it frozen. However, this practice often causes the backbone to overfit to base classes while overlooking novel ones, leading to severe confusion between them. To address this issue, we propose the Class-Aware Logit Adapter (CALA). Our method involves a lightweight adapter that learns to rectify biased predictions through a pseudo-incremental learning paradigm. In the real FSCIL process, we use the learned adapter to dynamically generate robust balancing factors. These factors adjust confused novel instances back to their true label space based on their similarity to base classes. Specifically, confusion is more likely for novel instances that closely resemble base classes, so these instances require greater rectification. Notably, CALA operates at the classifier level and preserves the original feature space, so it can be flexibly plugged into most existing FSCIL methods for improved performance. Experiments on three benchmark datasets consistently validate the effectiveness and flexibility of CALA. Code will be available upon acceptance.

Paper Structure

This paper contains 18 sections, 11 equations, 7 figures, 4 tables, 1 algorithm.

Figures (7)

  • Figure 1: Comparisons of (a) previous works with the incremental-frozen framework and (b) our class-aware logit adapter (CALA) for FSCIL from both the logit view and the feature space view.
  • Figure 2: An overview of our method CALA. (a) In the base session, we use sufficient base data to pre-train a generalizable backbone that will be frozen in all subsequent processes. (b) In the upper branch, i.e., the pseudo-training stage, we mix up base data to create fake novel data and mimic an FSCIL process to train a relatively robust class-aware logit adapter. In the lower branch, i.e., during real FSCIL, we use the adapter to calculate a class-aware $\boldsymbol{\beta}$, and rectify the final logit in the testing stage.
  • Figure 3: Comparison with the state-of-the-art works on the other two benchmarks: (a) CIFAR100 and (b) CUB200.
  • Figure 4: Ablation study on the logit distribution before and after our class-aware logit adjustment strategy on CIFAR100. (a) The logit distribution of a baseline is based on the incremental-frozen framework which lacks logit adjustment. (b) The logit distribution of the same baseline with the logit adjustment of our method.
  • Figure 5: Confusion matrices without and with CALA on mini-ImageNet. Red lines are used to separate base classes and novel classes. CALA effectively improves the prediction accuracy in novel classes, resulting in a bluer diagonal in the last 40 classes.
  • ...and 2 more figures
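
The pseudo-training stage described in the Figure 2 caption, where base data are mixed up to create fake novel data, might look roughly like the sketch below. This is an assumption-laden illustration, not the paper's procedure: the mixing coefficient `lam`, the choice to mix exactly two base classes, the mean-feature prototype, and the random linear map `embed` standing in for the frozen backbone are all hypothetical.

```python
import numpy as np

def mixup_pair(x_a, x_b, lam):
    """Convex combination of two batches of base-class samples."""
    return lam * x_a + (1.0 - lam) * x_b

def class_prototype(features):
    """Prototype as the mean feature vector of a class (a common FSCIL choice)."""
    return features.mean(axis=0)

rng = np.random.default_rng(0)

# Stand-in for the frozen backbone: a fixed random linear projection (assumption).
proj = rng.normal(size=(32, 16))

def embed(x):
    return x @ proj

# Five samples each from two hypothetical base classes.
x_a = rng.normal(size=(5, 32))
x_b = rng.normal(size=(5, 32))

lam = 0.5                                    # assumed mixing coefficient
fake_novel = mixup_pair(x_a, x_b, lam)       # fake novel samples
fake_proto = class_prototype(embed(fake_novel))
```

With a real, nonlinear backbone the fake prototype would not be a simple blend of the two base prototypes; it is here only because the stand-in `embed` is linear.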