Rebalancing Multi-Label Class-Incremental Learning

Kaile Du, Yifan Zhou, Fan Lyu, Yuyang Li, Junzhou Xie, Yixi Shen, Fuyuan Hu, Guangcan Liu

TL;DR

The paper tackles the positive-negative imbalance that arises in multi-label class-incremental learning (MLCIL) under task-level partial labeling (PL) by proposing RebLL, a two-module framework that combines asymmetric knowledge distillation (AKD) for loss rebalancing with online relabeling (OR) for label rebalancing. AKD emphasizes learning from negative labels in the classification loss and down-weights overconfident predictions in the distillation loss, while OR restores the original class distribution in memory by relabeling missing labels online. Experiments on PASCAL VOC and MS-COCO show state-of-the-art results across challenging incremental scenarios with a vanilla CNN backbone, with strong resistance to forgetting and improvements on multi-label metrics such as mAP, CF1, and OF1. Overall, RebLL offers a practical way to balance both loss and labels in partial-label MLCIL, though the authors note that gains may be less pronounced in very simple CIL tasks.

Abstract

Multi-label class-incremental learning (MLCIL) is essential for real-world multi-label applications, allowing models to learn new labels while retaining previously learned knowledge continuously. However, recent MLCIL approaches can only achieve suboptimal performance due to the oversight of the positive-negative imbalance problem, which manifests at both the label and loss levels because of the task-level partial label issue. The imbalance at the label level arises from the substantial absence of negative labels, while the imbalance at the loss level stems from the asymmetric contributions of the positive and negative loss parts to the optimization. To address the issue above, we propose a Rebalance framework for both the Loss and Label levels (RebLL), which integrates two key modules: asymmetric knowledge distillation (AKD) and online relabeling (OR). AKD is proposed to rebalance at the loss level by emphasizing the negative label learning in classification loss and down-weighting the contribution of overconfident predictions in distillation loss. OR is designed for label rebalance, which restores the original class distribution in memory by online relabeling the missing classes. Our comprehensive experiments on the PASCAL VOC and MS-COCO datasets demonstrate that this rebalancing strategy significantly improves performance, achieving new state-of-the-art results even with a vanilla CNN backbone.
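
The loss-level idea described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' formulation, which this section does not give. It assumes (a) a binary cross-entropy whose negative-label part is scaled by a hypothetical factor `neg_weight`, and (b) a distillation term on old-class outputs in which each entry is scaled by `(1 - confidence) ** gamma`, where `confidence` measures how far the old model's probability is from 0.5. The function names, `neg_weight`, and `gamma` are illustrative assumptions.

```python
import numpy as np


def sigmoid(z):
    """Element-wise logistic function."""
    return 1.0 / (1.0 + np.exp(-z))


def asymmetric_bce(logits, targets, neg_weight=2.0, eps=1e-8):
    """BCE whose negative-label part is up-weighted (assumed form, not the paper's)."""
    p = sigmoid(logits)
    pos_part = targets * np.log(p + eps)
    neg_part = (1.0 - targets) * np.log(1.0 - p + eps)
    return float(-(pos_part + neg_weight * neg_part).mean())


def downweighted_distillation(new_logits, old_logits, gamma=2.0, eps=1e-8):
    """Distillation BCE in which overconfident old-model outputs count less.

    The weighting rule is an assumption chosen for illustration.
    """
    q = sigmoid(old_logits)              # soft targets from the frozen old model
    p = sigmoid(new_logits)              # new model's predictions on old classes
    confidence = np.abs(2.0 * q - 1.0)   # 0 when q = 0.5, close to 1 when q is near 0 or 1
    weight = (1.0 - confidence) ** gamma
    bce = -(q * np.log(p + eps) + (1.0 - q) * np.log(1.0 - p + eps))
    return float((weight * bce).mean())


# Toy usage: one image, three new-task classes and two old-task classes.
new_task_logits = np.array([[2.0, -1.0, 0.5]])
new_task_labels = np.array([[1.0, 0.0, 0.0]])
old_class_logits_old_model = np.array([[8.0, 0.2]])
old_class_logits_new_model = np.array([[1.0, 0.0]])

cls_loss = asymmetric_bce(new_task_logits, new_task_labels)
kd_loss = downweighted_distillation(old_class_logits_new_model,
                                    old_class_logits_old_model)
print(cls_loss, kd_loss)
```

Setting `neg_weight=1.0` recovers ordinary binary cross-entropy, and `gamma=0.0` recovers unweighted distillation, which makes the two knobs easy to ablate separately in this sketch.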

Paper Structure (18 sections, 9 equations, 6 figures, 6 tables)

Figures (6)

  • Figure 1: A diagram of multi-label class-incremental learning. Labels are trained and tested across tasks from Task $1$ to Task $T$. Missing labels are highlighted in red. The training label space is task-specific, while the testing label space progressively expands with the addition of each new task.
  • Figure 2: (a) The RebLL framework consists of two modules: AKD and OR. In AKD, a training image is fed into both old ($f^{T-1}_{\theta}$) and new ($f^{T}_{\theta}$) models, where the output old task predictions (in green) are used to compute $L_{\text{akd}}$. The new task predictions (in blue) are compared with the new label to compute $L_{\text{cls}}$. In OR, the partially labeled memory sample is relabeled to form the full label, which is then used in conjunction with the predictions to compute $L_{\text{er}}$. (b) The quantitative comparison between KD, CSC and AKD after training on the final task in {B4-C2} of VOC 2007.
  • Figure 3: Online relabeling. For the label block matrix, missing new task labels above the main diagonal are relabeled using the trained current model, while the missing old task labels below the main diagonal are relabeled using the past model. This process reduces the false positive rate (FPR) from 6.7% to 2.4%.
  • Figure 4: Incremental results on PASCAL VOC in challenging scenarios. There are more tasks in these scenarios.
  • Figure 5: Analysis of $\alpha$ and $\beta$ for exponential decay factor.
  • ...and 1 more figure
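
The relabeling rule in the Figure 3 caption can be sketched as follows. This is a minimal NumPy illustration under stated assumptions, not the authors' implementation: each memory sample is assumed to carry ground-truth labels only for the classes of the task that annotated it, missing labels are filled by thresholding sigmoid outputs at a hypothetical `threshold` of 0.5, and the function name and argument names are invented for this example.

```python
import numpy as np


def online_relabel(labels, sample_task, class_task,
                   past_probs, current_probs, threshold=0.5):
    """Fill missing entries of a partially labeled memory batch.

    labels:        (N, C) array; only each sample's own-task classes are trusted.
    sample_task:   (N,) task index that annotated each sample.
    class_task:    (C,) task index that introduced each class.
    past_probs:    (N, C) probabilities from the past model.
    current_probs: (N, C) probabilities from the trained current model.
    """
    full = np.asarray(labels, dtype=float).copy()
    s = np.asarray(sample_task)[:, None]   # shape (N, 1)
    c = np.asarray(class_task)[None, :]    # shape (1, C)

    older = c < s   # old-task classes: below the diagonal of the block matrix
    newer = c > s   # new-task classes: above the diagonal of the block matrix

    past_hard = np.asarray(past_probs) >= threshold
    current_hard = np.asarray(current_probs) >= threshold

    full[older] = past_hard[older]        # old-task gaps come from the past model
    full[newer] = current_hard[newer]     # new-task gaps come from the current model
    return full


# Toy usage: three tasks with two classes each, two memory samples.
class_task = [0, 0, 1, 1, 2, 2]
sample_task = [0, 1]
labels = [[1, 0, 0, 0, 0, 0],    # annotated in task 0: only classes 0-1 are known
          [0, 0, 0, 1, 0, 0]]    # annotated in task 1: only classes 2-3 are known
past_probs = [[0.9, 0.1, 0.2, 0.1, 0.0, 0.0],
              [0.8, 0.3, 0.1, 0.9, 0.0, 0.0]]
current_probs = [[0.7, 0.2, 0.6, 0.1, 0.9, 0.2],
                 [0.6, 0.1, 0.2, 0.8, 0.3, 0.7]]

full_labels = online_relabel(labels, sample_task, class_task,
                             past_probs, current_probs)
print(full_labels)
```

Entries on the diagonal blocks, where a sample's own task supplied the annotation, are left untouched, so only the missing parts of the label vector depend on model predictions.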