Enhancing Consistency and Mitigating Bias: A Data Replay Approach for Incremental Learning

Chenyang Wang, Junjun Jiang, Xingyu Hu, Xianming Liu, Xiangyang Ji

TL;DR

This paper tackles catastrophic forgetting in class incremental learning under data privacy constraints by focusing on data-free replay through model inversion. It introduces CwD, a Consistency-enhanced data replay framework with a Debiased classifier that adds a data-consistency enhancement loss (DCE) to align inverted and real data distributions under a tied multivariate Gaussian assumption, and a weight alignment regularization (WAR) to balance class weights during training. An extra estimation stage collects old-task statistics to improve future inversions, forming a three-stage pipeline (Inversion, Training, Estimation). Across CIFAR-100, Tiny-ImageNet, and ImageNet-100, CwD consistently improves last-task accuracy and average performance over prior data-free baselines and can boost non-data-free baselines when combined. The work also provides thorough ablations, debiasing comparisons, and overhead analyses, highlighting practical gains and areas for further refinement in data-free continual learning.

Abstract

Deep learning systems are prone to catastrophic forgetting when learning from a sequence of tasks, as old data from previous tasks is unavailable when learning a new task. To address this, some methods replay data from previous tasks while learning a new task, typically using extra memory to store the replay data. However, storing old data is often impractical because of memory constraints and data privacy issues. Instead, data-free replay methods invert samples from the classification model. While effective, these methods suffer from inconsistencies between inverted and real training data, which recent works have overlooked. To address this, we propose to measure data consistency quantitatively under some simplifications and assumptions. This measurement gives us the insight to develop a novel loss function that reduces the inconsistency. Specifically, the loss minimizes the KL divergence between the distributions of inverted and real data under a tied multivariate Gaussian assumption, which is simple to implement in continual learning. Additionally, we observe that the weight norms of old classes decrease continually as learning progresses. We analyze the reasons and propose a regularization term to balance class weights, making old class samples more distinguishable. To conclude, we introduce Consistency-enhanced data replay with a Debiased classifier for class incremental learning (CwD). Extensive experiments on CIFAR-100, Tiny-ImageNet, and ImageNet-100 show that CwD consistently improves performance over previous approaches.
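
To make the consistency measure concrete: when two Gaussians share one covariance matrix $\Sigma$, the KL divergence between them reduces to $\frac{1}{2}(\mu_1-\mu_2)^\top \Sigma^{-1} (\mu_1-\mu_2)$, because the trace and log-determinant terms cancel. The sketch below computes this quantity with NumPy. It is only an illustration, not the paper's exact loss: the function name `tied_gaussian_kl`, the choice to estimate the shared covariance from the real features, and the ridge term `eps` are all assumptions made here, and the abstract does not say how classes are handled.

```python
import numpy as np

def tied_gaussian_kl(real_feats, inverted_feats, eps=1e-3):
    """KL divergence between two Gaussians that share one covariance matrix.

    real_feats, inverted_feats: arrays of shape (num_samples, feat_dim).
    The shared covariance is estimated from the real features (an assumption
    of this sketch), with a small ridge `eps` to keep it invertible.
    """
    mu_real = real_feats.mean(axis=0)
    mu_inv = inverted_feats.mean(axis=0)

    # Shared ("tied") covariance estimate.
    centered = real_feats - mu_real
    sigma = centered.T @ centered / max(len(real_feats) - 1, 1)
    sigma = sigma + eps * np.eye(sigma.shape[0])

    # With equal covariances, only the Mahalanobis term between the means remains.
    diff = mu_inv - mu_real
    return 0.5 * float(diff @ np.linalg.solve(sigma, diff))

# Toy usage with synthetic features (hypothetical data, not from the paper).
rng = np.random.default_rng(0)
real = rng.normal(size=(500, 4))
inverted = real + 1.0  # same spread, shifted mean
print(tied_gaussian_kl(real, inverted))
```

Under this assumption the divergence depends only on the gap between the two means, so an inversion loss built on it needs only a mean and a covariance per distribution rather than stored samples.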

Paper Structure (25 sections, 18 equations, 7 figures, 7 tables, 1 algorithm)

This paper contains 25 sections, 18 equations, 7 figures, 7 tables, 1 algorithm.

Figures (7)

  • Figure 1: Schematic illustration of data consistency enhancement. Left: real samples, with the distribution estimated from them. Middle: inverted samples before data consistency enhancement. Right: inverted samples after data consistency enhancement.
  • Figure 2: An overview of our proposed CwD framework. Inversion: when a new task arrives, we first invert samples from the old model with the help of the statistical parameters from the old task. The data consistency enhancement loss $L_{dce}$ is applied in this stage. Training: we use the inverted data and the real new data to train a new model. During this stage, the weight alignment regularization loss $L_{war}$ regularizes the class weights to be unbiased. Estimation: when training is over, we estimate the statistical parameters of all classes using the new model.
  • Figure 3: The norms of class weights in the standard 5-task setting.
  • Figure 4: The norms of class weights in 5-task experiments with different debiasing approaches. (a) Asymmetric Cross-entropy. (b) Separated-Softmax for Incremental Learning. (c) Weight Aligning. (d) CwD with Weight Alignment Regularization.
  • Figure 5: The influence of $\lambda_{dce}$ and $\lambda_{war}$ on CIFAR-100 with 5, 10 and 20 tasks. Left: searching over $\lambda_{dce}$ with $\lambda_{war}$ fixed. Right: searching over $\lambda_{war}$ with $\lambda_{dce}$ fixed.
  • ...and 2 more figures
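
The class weight norms plotted in Figures 3 and 4 can be computed directly from the weight matrix of the final classification layer. The sketch below shows that computation, together with a purely illustrative penalty on the gap between old-class and new-class norms. The penalty is not the paper's $L_{war}$, whose formula does not appear in this excerpt; `norm_gap_penalty`, `num_old`, and the toy weight matrix are hypothetical names and values chosen for this example.

```python
import numpy as np

def class_weight_norms(W):
    """L2 norm of each class's weight vector; W has shape (num_classes, feat_dim)."""
    return np.linalg.norm(W, axis=1)

def norm_gap_penalty(W, num_old):
    """Illustrative penalty (an assumption, not the paper's L_war):
    squared gap between the mean old-class and mean new-class weight norms."""
    norms = class_weight_norms(W)
    return float((norms[:num_old].mean() - norms[num_old:].mean()) ** 2)

# Toy classifier: 3 old classes with small weights, 2 new classes with large weights.
W = np.vstack([0.5 * np.ones((3, 4)), 2.0 * np.ones((2, 4))])
print(class_weight_norms(W))          # old-class norms are 1.0, new-class norms are 4.0
print(norm_gap_penalty(W, num_old=3)) # 9.0
```

A penalty of this kind is zero when old and new classes have the same average norm, which is the balanced situation the debiasing approaches in Figure 4 aim for.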