Compositional Few-Shot Class-Incremental Learning

Yixiong Zou, Shanghang Zhang, Haichen Zhou, Yuhua Li, Ruixuan Li

TL;DR

The paper tackles few-shot class-incremental learning (FSCIL) with a compositional framework that decomposes classes into reusable visual primitives (patch-based features) and treats recognition as a composition over primitive sets. A set similarity based on Centered Kernel Alignment (CKA) scores how well an input's primitives align with each class's primitives, and a primitive reuse module replaces each primitive with its closest counterpart from other classes to encourage reuse and reduce forgetting. Training combines three losses ($L_{cls}$, $L_{cmp}$, $L_{rcmp}$) to balance classification accuracy against effective composition and cross-class reuse. On miniImageNet, CIFAR100, and CUB200, the method achieves state-of-the-art last-session accuracy, and its learned primitives correspond to visible image parts, so predictions can be explained as compositions of those parts. The analysis also sheds light on the role of primitives and their reusability in FSCIL. The result is a more transparent and scalable FSCIL framework, with potential extensions to broader few-shot and many-shot scenarios.

Abstract

Few-shot class-incremental learning (FSCIL) is proposed to continually learn from novel classes with only a few samples after the (pre-)training on base classes with sufficient data. However, this remains a challenge. In contrast, humans can easily recognize novel classes with a few samples. Cognitive science demonstrates that an important component of such human capability is compositional learning. This involves identifying visual primitives from learned knowledge and then composing new concepts using these transferred primitives, making incremental learning both effective and interpretable. To imitate human compositional learning, we propose a cognitive-inspired method for the FSCIL task. We define and build a compositional model based on set similarities, and then equip it with a primitive composition module and a primitive reuse module. In the primitive composition module, we propose to utilize the Centered Kernel Alignment (CKA) similarity to approximate the similarity between primitive sets, allowing the training and evaluation based on primitive compositions. In the primitive reuse module, we enhance primitive reusability by classifying inputs based on primitives replaced with the closest primitives from other classes. Experiments on three datasets validate our method, showing it outperforms current state-of-the-art methods with improved interpretability. Our code is available at https://github.com/Zoilsen/Comp-FSCIL.
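
To make the CKA-based composition score concrete, the following is a minimal, dependency-free Python sketch of linear CKA, a common form of Centered Kernel Alignment. It is not the authors' implementation: the linear kernel, the helper names (`linear_cka`, `composition_scores`), the toy values, and the choice to treat feature dimensions as the shared axis (so that primitive sets of different sizes can be compared) are all assumptions made for illustration.

```python
from math import sqrt


def center_columns(m):
    """Subtract each column's mean from a row-major matrix."""
    n = len(m)
    means = [sum(col) / n for col in zip(*m)]
    return [[v - mu for v, mu in zip(row, means)] for row in m]


def cross_product(a, b):
    """Return a^T b for row-major matrices a (n, p) and b (n, q)."""
    return [[sum(x * y for x, y in zip(col_a, col_b)) for col_b in zip(*b)]
            for col_a in zip(*a)]


def frobenius(m):
    """Frobenius norm of a row-major matrix."""
    return sqrt(sum(v * v for row in m for v in row))


def linear_cka(x, y):
    """Linear CKA between x (n, p) and y (n, q); rows must correspond."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same number of rows")
    x, y = center_columns(x), center_columns(y)
    cross = frobenius(cross_product(y, x)) ** 2
    denom = frobenius(cross_product(x, x)) * frobenius(cross_product(y, y))
    return cross / denom if denom > 0 else 0.0


def composition_scores(input_primitives, class_primitive_sets):
    """Score an input's primitives against each class's primitive set.

    ASSUMPTION: sets are transposed so that the d feature dimensions act as
    the shared rows; the paper may arrange its matrices differently.
    """
    x_t = [list(col) for col in zip(*input_primitives)]
    return {name: linear_cka(x_t, [list(col) for col in zip(*z)])
            for name, z in class_primitive_sets.items()}


# Toy data (hypothetical): 2 input primitives with d = 3 features each.
input_primitives = [[1.0, 0.0, 2.0], [0.5, 1.0, 0.0]]
class_sets = {
    "A": [[2.0, 0.0, 4.0], [1.0, 2.0, 0.0]],                   # scaled copy of the input
    "B": [[0.0, 3.0, 1.0], [1.0, 1.0, 1.0], [2.0, 0.0, 0.0]],  # unrelated set of 3
}
scores = composition_scores(input_primitives, class_sets)
predicted = max(scores, key=scores.get)  # class with the highest composition score
```

In this toy case class A's primitives are a scaled copy of the input's, so its score equals 1 up to floating-point error, because linear CKA is invariant to isotropic scaling. The scores play the role of classification scores: the predicted class is the one whose primitive set best composes the input.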

Paper Structure (27 sections, 15 equations, 12 figures, 9 tables)

Figures (12)

  • Figure 1: As studied in cognitive science (Biederman, 1987), humans learn compositionally: they divide learned knowledge into primitives and then compose these primitives to learn novel knowledge, which underlies the human ability to learn incrementally from scarce data. To imitate this ability, we propose a compositional learning method for the few-shot class-incremental learning (FSCIL) task. We briefly plot the primitives automatically found by our method together with their possible meanings, which show good reusability and interpretability. Detailed plots are given in the paper's composition figure.
  • Figure 1: Samples of miniImageNet.
  • Figure 2: We take image patches as candidate primitives and use a set of prototypes to construct the primitive set for each class. Given an input sample, our method tries to compose it from the primitive sets (e.g., $Z^A$ and $Z^B$) of different classes (e.g., classes A and B); how well each set composes the input is measured by the CKA similarity and called the composition score. These composition scores then serve as the classification scores for model training and evaluation. To improve the reusability of primitives across classes, each primitive is replaced with the closest primitive from other classes. The replaced primitive sets (e.g., $\tilde{Z}^A$ and $\tilde{Z}^B$) are then also used for classification with the composition score. Finally, our model is trained with the combination of $L_{cls}$, $L_{cmp}$, and $L_{rcmp}$ during both the base and incremental sessions.
  • Figure 2: Samples of CIFAR100.
  • Figure 3: Visualization of class-activation-map (CAM). BL: Baseline model; CF: Our compositional model. CF-CAM activates smaller regions than BL-CAM and filters out sample-specific regions such as background, validating the focus on shared patches.
  • ...and 7 more figures
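
The replacement step described in the Figure 2 caption can be pictured as a nearest-neighbor lookup across classes. The sketch below is only an illustration under stated assumptions: squared Euclidean distance stands in for whatever closeness measure the paper uses, and the function name `replace_with_closest` and the toy primitive values are hypothetical.

```python
def squared_distance(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def replace_with_closest(class_primitive_sets):
    """Replace every primitive with its nearest primitive from OTHER classes.

    ASSUMPTION: closeness is squared Euclidean distance; the paper's actual
    measure is not specified in this summary.
    """
    replaced = {}
    for name, primitives in class_primitive_sets.items():
        # Pool the primitives of every class except the current one.
        candidates = [p
                      for other, prims in class_primitive_sets.items()
                      if other != name
                      for p in prims]
        if not candidates:
            raise ValueError("need primitives from at least one other class")
        replaced[name] = [min(candidates, key=lambda c: squared_distance(z, c))
                          for z in primitives]
    return replaced


# Toy primitive sets (hypothetical values) for three classes.
primitive_sets = {
    "A": [[0.0, 0.0], [1.0, 1.0]],
    "B": [[0.1, 0.0], [5.0, 5.0]],
    "C": [[0.9, 1.2]],
}
replaced_sets = replace_with_closest(primitive_sets)
# replaced_sets["A"] == [[0.1, 0.0], [0.9, 1.2]]
```

Each replaced set keeps the size of the original set but contains only primitives borrowed from other classes. Classifying with these replaced sets rewards primitives that remain useful when shared across classes, which is the reusability the caption refers to.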

Theorems & Definitions (2)

  • Definition 3.1
  • Definition 3.2