Class Balance Matters to Active Class-Incremental Learning
Zitong Huang, Ze Chen, Yuanze Li, Bowen Dong, Erjin Zhou, Yong Liu, Rick Siow Mong Goh, Chun-Mei Feng, Wangmeng Zuo
TL;DR
The paper tackles Active Class-Incremental Learning (ACIL) by addressing the tendency of traditional active learning to produce class-imbalanced labeled sets that degrade incremental learning. It introduces Class-Balanced Selection (CBS), a clustering-based, KL-divergence-guided greedy sampling strategy that aligns the distribution of selected samples with the unlabeled pool while preserving informativeness, and demonstrates its plug-and-play compatibility with pretrained-model–based CIL methods using prompt tuning (e.g., L2P, DualPrompt, LP-DiF). CBS consistently outperforms random sampling and existing active-learning baselines across five datasets under varying labeling budgets, and gains further when combined with LP-DiF’s unlabeled-data replay mechanism. The work shows that balancing class representation in the annotated pool is crucial for high-quality incremental learning, offering a practical approach to reduce labeling costs while maintaining strong performance in dynamic, multi-session settings.
Abstract
Few-Shot Class-Incremental Learning has proven remarkably effective at learning new concepts from limited annotations. Nevertheless, heuristically chosen few-shot annotations may not always cover the most informative samples, which largely restricts the capability of the incremental learner. We instead start from a large pool of unlabeled data and annotate the most informative samples for incremental learning. Based on this premise, this paper introduces Active Class-Incremental Learning (ACIL). The objective of ACIL is to select the most informative samples from the unlabeled pool for training an incremental learner, so as to maximize the performance of the resulting model. Note that vanilla active learning algorithms suffer from a class-imbalanced distribution among the annotated samples, which restricts the ability of incremental learning. To achieve both class balance and informativeness in the chosen samples, we propose the Class-Balanced Selection (CBS) strategy. Specifically, we first cluster the features of all unlabeled images into multiple groups. Then, for each cluster, we employ a greedy selection strategy to ensure that the Gaussian distribution of the sampled features closely matches the Gaussian distribution of all unlabeled features within the cluster. CBS can be plugged into CIL methods that are based on pretrained models with prompt tuning techniques. Extensive experiments under the ACIL protocol across five diverse datasets demonstrate that CBS outperforms both random selection and other SOTA active learning approaches. Code is publicly available at https://github.com/1170300714/CBS.
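The two steps described in the abstract (clustering the unlabeled features, then greedily picking samples per cluster so that their Gaussian statistics match those of the cluster) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation; the released code at the URL above is the reference. Several details here are assumptions made only for illustration: a plain k-means for clustering, diagonal-covariance Gaussians, the direction of the KL divergence, an equal labeling quota per cluster, and all function names (`kmeans`, `diag_gaussian_kl`, `greedy_match`, `class_balanced_select`).

```python
import numpy as np


def kmeans(features, k, iters=20, seed=0):
    """Plain k-means (assumed clustering step); returns a cluster id per row."""
    rng = np.random.default_rng(seed)
    centers = features[rng.choice(len(features), size=k, replace=False)]
    for _ in range(iters):
        # Squared Euclidean distance from every sample to every center.
        dists = ((features[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = features[labels == j]
            if len(members) > 0:
                centers[j] = members.mean(axis=0)
    return labels


def diag_gaussian_kl(mu_p, var_p, mu_q, var_q):
    """KL(N(mu_p, diag var_p) || N(mu_q, diag var_q)) for diagonal Gaussians."""
    return 0.5 * np.sum(
        np.log(var_q / var_p) + (var_p + (mu_p - mu_q) ** 2) / var_q - 1.0
    )


def greedy_match(cluster_feats, n_select, eps=1e-6):
    """Greedily pick rows whose Gaussian statistics approach the cluster's."""
    mu_all = cluster_feats.mean(axis=0)
    var_all = cluster_feats.var(axis=0) + eps  # eps avoids division by zero
    selected = []
    remaining = list(range(len(cluster_feats)))
    for _ in range(min(n_select, len(remaining))):
        best_idx, best_kl = None, np.inf
        for i in remaining:
            cand = cluster_feats[selected + [i]]
            kl = diag_gaussian_kl(
                cand.mean(axis=0), cand.var(axis=0) + eps, mu_all, var_all
            )
            if kl < best_kl:
                best_idx, best_kl = i, kl
        selected.append(best_idx)
        remaining.remove(best_idx)
    return selected


def class_balanced_select(features, budget, n_clusters, seed=0):
    """Return indices to annotate; equal quota per cluster is an assumption."""
    labels = kmeans(features, n_clusters, seed=seed)
    quota = budget // n_clusters
    chosen = []
    for j in range(n_clusters):
        idx = np.flatnonzero(labels == j)
        if len(idx) == 0:
            continue
        local = greedy_match(features[idx], quota)
        chosen.extend(idx[local].tolist())
    return chosen
```

Calling `class_balanced_select(features, budget, n_clusters)` on an `(N, D)` array of image features returns at most `budget` row indices to send for annotation. Because every cluster contributes the same quota, clusters that roughly correspond to classes yield a roughly class-balanced labeled set, which is the property the abstract argues for.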
