Compositional Few-Shot Class-Incremental Learning
Yixiong Zou, Shanghang Zhang, Haichen Zhou, Yuhua Li, Ruixuan Li
TL;DR
The paper tackles few-shot class-incremental learning (FSCIL) with a compositional framework that decomposes classes into reusable visual primitives (patch-based features) and models recognition as a composition over primitive sets. A set similarity based on Centered Kernel Alignment (CKA) scores how well an input's primitives align with each class's primitives, and a primitive reuse module swaps primitives for their closest counterparts from other classes to encourage reuse and reduce forgetting. Training optimizes three losses ($L_{cls}$, $L_{cmp}$, $L_{rcmp}$) that balance classification accuracy against effective composition and cross-class reuse. Experiments on miniImageNet, CIFAR100, and CUB200 show state-of-the-art last-session accuracy, and the visualized primitive parts and compositions make predictions more interpretable while shedding light on the role and reusability of primitives in FSCIL. The result is a more transparent FSCIL framework that may also extend to broader few-shot and many-shot scenarios.
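To make the CKA-based set similarity concrete, the following is a minimal sketch of standard linear CKA in NumPy. It is not the authors' implementation: the function name `linear_cka`, the matrix layout (rows are feature channels, columns are primitives), and the toy shapes are all assumptions made for illustration. With this layout the score does not depend on the order of the primitives, which is the property one would want from a similarity between sets.

```python
import numpy as np


def linear_cka(X, Y):
    """Standard linear CKA between X (n x p) and Y (n x q).

    Assumed layout (hypothetical, for illustration): the n rows are feature
    channels shared by both inputs, and the columns are the primitives of
    each set. The two sets may contain different numbers of primitives.
    """
    # Center every column so the Gram matrices are centered.
    X = X - X.mean(axis=0, keepdims=True)
    Y = Y - Y.mean(axis=0, keepdims=True)
    # Feature-space form: ||Y^T X||_F^2 / (||X^T X||_F * ||Y^T Y||_F).
    cross = np.linalg.norm(Y.T @ X, ord="fro") ** 2
    norm_x = np.linalg.norm(X.T @ X, ord="fro")
    norm_y = np.linalg.norm(Y.T @ Y, ord="fro")
    return cross / (norm_x * norm_y)


# Toy data: 64 feature channels, 5 input primitives, 7 class primitives.
rng = np.random.default_rng(0)
input_prims = rng.normal(size=(64, 5))
class_prims = rng.normal(size=(64, 7))
score = linear_cka(input_prims, class_prims)
```

Under these assumptions, a classifier could compute one such score per class and predict the class with the highest score. The sketch omits any guard against constant inputs, for which the denominator would be zero.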
Abstract
Few-shot class-incremental learning (FSCIL) aims to continually learn novel classes from only a few samples after (pre-)training on base classes with sufficient data. However, this remains a challenge. In contrast, humans can easily recognize novel classes from a few samples. Cognitive science demonstrates that an important component of this human capability is compositional learning: identifying visual primitives from learned knowledge and then composing new concepts from these transferred primitives, which makes incremental learning both effective and interpretable. To imitate human compositional learning, we propose a cognitive-inspired method for the FSCIL task. We define and build a compositional model based on set similarities, and then equip it with a primitive composition module and a primitive reuse module. In the primitive composition module, we use the Centered Kernel Alignment (CKA) similarity to approximate the similarity between primitive sets, enabling training and evaluation based on primitive compositions. In the primitive reuse module, we enhance primitive reusability by classifying inputs based on primitives replaced with the closest primitives from other classes. Experiments on three datasets validate our method, showing that it outperforms current state-of-the-art methods with improved interpretability. Our code is available at https://github.com/Zoilsen/Comp-FSCIL.
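The replacement step in the primitive reuse module can be sketched as a nearest-neighbor lookup across classes. This is an illustrative reading of the abstract, not the released code: the function name `replace_with_closest`, the `(classes, primitives, dim)` array layout, and the choice of cosine similarity as the notion of "closest" are all assumptions.

```python
import numpy as np


def replace_with_closest(primitives, c):
    """Swap each primitive of class c for its closest primitive from other classes.

    primitives: array of shape (C, m, d) holding m primitives of dimension d
    for each of C classes (hypothetical layout). Closeness is measured by
    cosine similarity, which is an assumption of this sketch.
    Returns an (m, d) array of replacement primitives.
    """
    d = primitives.shape[-1]
    own = primitives[c]                                        # (m, d)
    # Pool the primitives of every class except c.
    others = np.delete(primitives, c, axis=0).reshape(-1, d)   # ((C-1)*m, d)
    # L2-normalize so that the dot product equals cosine similarity.
    own_n = own / np.linalg.norm(own, axis=1, keepdims=True)
    oth_n = others / np.linalg.norm(others, axis=1, keepdims=True)
    sim = own_n @ oth_n.T                                      # (m, (C-1)*m)
    # For each primitive of class c, pick the most similar foreign primitive.
    return others[sim.argmax(axis=1)]


# Toy example: 3 classes, 2 primitives each, 3-dimensional features.
prims = np.array([
    [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]],
    [[0.9, 0.1, 0.0], [0.0, 0.0, 1.0]],
    [[0.0, 2.0, 0.1], [-1.0, 0.0, 0.0]],
])
reused = replace_with_closest(prims, 0)
```

In this toy example, the two primitives of class 0 are replaced by `[0.9, 0.1, 0.0]` from class 1 and `[0.0, 2.0, 0.1]` from class 2. Classifying an input against such a replaced set would then reward primitives that transfer across classes.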
