Class-Independent Increment: An Efficient Approach for Multi-label Class-Incremental Learning
Chenhao Ding, Songlin Dong, Zhengdong Zhou, Jizhou Han, Qiang Wang, Yuhang He, Yihong Gong
TL;DR
The paper tackles multi-label class-incremental learning (MLCIL), where catastrophic forgetting is compounded by inter-session and intra-feature confusion. Its method, class-independent increment (CLIN), uses a class-independent incremental network (CINet) that produces one embedding per class through class-specific tokens and a class-level cross-attention mechanism, rather than a single image-level feature. Two losses, one of them a multi-label contrastive loss, optimize the tokens and the class-level embeddings to separate old and new classes while preserving learned knowledge. Experiments on MS-COCO and PASCAL VOC, including challenging settings and buffer-free scenarios, show improved recognition accuracy and reduced forgetting across incremental sessions.
Abstract
Current research on class-incremental learning primarily focuses on single-label classification tasks. However, real-world applications often involve multi-label scenarios, such as image retrieval and medical imaging. Therefore, this paper focuses on the challenging yet practical multi-label class-incremental learning (MLCIL) problem. In addition to the challenge of catastrophic forgetting, MLCIL encounters issues related to feature confusion, encompassing inter-session and intra-feature confusion. To address these problems, we propose a novel MLCIL approach called class-independent increment (CLIN). Specifically, in contrast to existing methods that extract image-level features, we propose a class-independent incremental network (CINet) to extract multiple class-level embeddings for multi-label samples. It learns and preserves the knowledge of different classes by constructing class-specific tokens. On this basis, we develop two novel loss functions, optimizing the learning of class-specific tokens and class-level embeddings, respectively. These losses aim to distinguish between new and old classes, further alleviating the problem of feature confusion. Extensive experiments on MS-COCO and PASCAL VOC datasets demonstrate the effectiveness of our method for improving recognition performance and mitigating forgetting on various MLCIL tasks.
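The idea of class-specific tokens producing class-level embeddings can be illustrated with a minimal sketch. This is not the authors' CINet: it assumes plain scaled dot-product cross-attention with no learned projections, random vectors standing in for patch features, and hypothetical helper names (`class_level_embeddings`, `add_session_tokens`). It only shows that each class token attends over the image's patch features to yield its own embedding, and that adding tokens for a new session leaves the embeddings of old classes unchanged.

```python
import math
import random


def softmax(xs):
    # Numerically stable softmax over a list of floats.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def class_level_embeddings(class_tokens, patch_features):
    """Return one embedding per class token (hypothetical helper).

    Each token acts as a query; the patch features act as keys and values.
    """
    dim = len(patch_features[0])
    scale = 1.0 / math.sqrt(dim)
    embeddings = []
    for token in class_tokens:
        weights = softmax([dot(token, p) * scale for p in patch_features])
        emb = [sum(w * p[d] for w, p in zip(weights, patch_features))
               for d in range(dim)]
        embeddings.append(emb)
    return embeddings


def add_session_tokens(class_tokens, num_new, dim, rng):
    """Append tokens for new classes; existing tokens are left untouched."""
    new_tokens = [[rng.gauss(0.0, 0.02) for _ in range(dim)]
                  for _ in range(num_new)]
    return class_tokens + new_tokens


rng = random.Random(0)
dim, num_patches = 8, 16
# Random stand-ins for the patch features of one image (assumption).
patches = [[rng.gauss(0.0, 1.0) for _ in range(dim)]
           for _ in range(num_patches)]

tokens = add_session_tokens([], 3, dim, rng)       # session 1: 3 classes
old_emb = class_level_embeddings(tokens, patches)
tokens = add_session_tokens(tokens, 2, dim, rng)   # session 2: 2 new classes
all_emb = class_level_embeddings(tokens, patches)
```

In this toy setting the first three entries of `all_emb` equal `old_emb`, because each class embedding depends only on its own token and the shared patch features. In a trained model the backbone and tokens would be learned, so this independence holds only to the extent that old tokens and features are preserved.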
