Feature Expansion and enhanced Compression for Class Incremental Learning

Quentin Ferdinand, Gilles Le Chenadec, Benoit Clement, Panagiotis Papadakis, Quentin Oliveau

TL;DR

This work tackles catastrophic forgetting in class incremental learning with a two-stage framework that first expands the model's feature space to accommodate new classes and then compresses it back to its original size. The key innovation is Rehearsal-CutMix, a CutMix-based augmentation that mixes patches of rehearsal-memory samples into new images during distillation, which rebalances the training data and strengthens knowledge transfer for past classes during compression. On CIFAR-100 and ImageNet-100/1000, the method (FECIL) consistently outperforms state-of-the-art methods, achieving higher average and last-step accuracies while keeping the model size fixed. The results are supported by ablations and an analysis of the training overhead.

Abstract

Class incremental learning consists of training discriminative models to classify an increasing number of classes over time. However, doing so using only the newly added class data leads to the known problem of catastrophic forgetting of the previous classes. Recently, dynamic deep learning architectures have been shown to exhibit a better stability-plasticity trade-off: they dynamically add new feature extractors to the model in order to learn new classes, then apply a compression step that scales the model back to its original size, thus avoiding a growing number of parameters. In this context, we propose a new algorithm that enhances the compression of previous class knowledge by cutting and mixing patches of previous class samples with the new images during compression using our Rehearsal-CutMix method. We show that this new data augmentation reduces catastrophic forgetting by specifically targeting past class information and improving its compression. Extensive experiments performed on the CIFAR and ImageNet datasets under diverse incremental learning evaluation protocols demonstrate that our approach consistently outperforms the state of the art. The code will be made available upon publication of our work.
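The mixing step described in the abstract can be illustrated with a minimal NumPy sketch. It assumes the standard CutMix box sampling (a Beta-distributed mixing ratio and a rectangular patch); the function names `rand_bbox` and `rehearsal_cutmix`, the `alpha` parameter, and the choice to return only the area ratio are illustrative assumptions, not the authors' implementation. How labels or distillation targets are combined is not specified here and is left out.

```python
import numpy as np

def rand_bbox(height, width, lam, rng):
    """Sample a box covering roughly (1 - lam) of the image area, as in standard CutMix."""
    cut_ratio = np.sqrt(1.0 - lam)
    cut_h, cut_w = int(height * cut_ratio), int(width * cut_ratio)
    cy, cx = rng.integers(height), rng.integers(width)  # random box centre
    y1, y2 = np.clip(cy - cut_h // 2, 0, height), np.clip(cy + cut_h // 2, 0, height)
    x1, x2 = np.clip(cx - cut_w // 2, 0, width), np.clip(cx + cut_w // 2, 0, width)
    return int(y1), int(y2), int(x1), int(x2)

def rehearsal_cutmix(new_image, memory_image, rng, alpha=1.0):
    """Paste a patch of a rehearsal-memory image (past class) into an
    incremental-dataset image (mostly new classes).

    Returns the mixed image and lam, the fraction of pixels that still
    come from new_image. alpha is a hypothetical Beta parameter.
    """
    height, width = new_image.shape[:2]
    lam = rng.beta(alpha, alpha)
    y1, y2, x1, x2 = rand_bbox(height, width, lam, rng)
    mixed = new_image.copy()
    mixed[y1:y2, x1:x2] = memory_image[y1:y2, x1:x2]
    # Recompute lam from the clipped box so it matches the actual pixel ratio.
    lam = 1.0 - (y2 - y1) * (x2 - x1) / (height * width)
    return mixed, lam

# Toy usage: an all-zeros "new" image and an all-ones "memory" image.
rng = np.random.default_rng(0)
new_image = np.zeros((32, 32, 3), dtype=np.float32)
memory_image = np.ones((32, 32, 3), dtype=np.float32)
mixed, lam = rehearsal_cutmix(new_image, memory_image, rng)
```

Unlike plain CutMix, which draws both images from the same batch, this sketch always takes the pasted patch from the rehearsal memory, so every mixed sample carries some past-class content.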

Paper Structure (20 sections, 8 equations, 4 figures, 3 tables)

Figures (4)

  • Figure 1: Overview of the differences between Mixup (Zhang et al., 2017), CutMix (Yun et al., 2019), and our Rehearsal-CutMix procedure. Our method specifically samples one image from the incremental dataset containing mostly new classes and one image from the rehearsal memory containing only previous classes before mixing them together for training.
  • Figure 2: Pipeline of the proposed approach. Each incremental step consists of two training phases: an expansion phase, where we dynamically expand $\Phi^{t-1}$ to learn new classes, and a compression phase, where we compress the expanded model $\Phi^{t}_{big}$ back to its original size with minimal performance drop using our Rehearsal-CutMix distillation mechanism.
  • Figure 3: Performance evolution on CIFAR-100. The top-1 accuracy (%) is reported after each incremental step. Left: 5 steps; middle: 10 steps; right: 20 steps.
  • Figure 4: Evolution of the time necessary for an epoch during 10 incremental steps on the CIFAR-100 dataset. FOSTER_exp and FECIL_exp represent the time per epoch during the expansion phase while FOSTER_compress and FECIL_compress illustrate the time per epoch of the compression step.
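
The two phases in Figure 2 can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the paper's training code: feature expansion is modelled as concatenating the outputs of an old and a new feature extractor, and compression is modelled with a generic temperature-scaled knowledge-distillation loss (KL divergence between teacher and student predictions). The function names and the `temperature` parameter are hypothetical, and the loss actually used by the authors may differ.

```python
import numpy as np

def softmax(logits, temperature=1.0):
    """Numerically stable softmax over the last axis."""
    z = logits / temperature
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def expand_features(old_features, new_features):
    """Expansion phase (assumed form): concatenate the features of the
    previous extractor with those of a newly added extractor."""
    return np.concatenate([old_features, new_features], axis=-1)

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """Compression phase (assumed form): mean KL(teacher || student) over the
    batch, where the teacher is the expanded model and the student is the
    compact model of the original size."""
    p_teacher = softmax(teacher_logits, temperature)
    log_p_teacher = np.log(p_teacher)
    log_p_student = np.log(softmax(student_logits, temperature))
    return float(np.mean(np.sum(p_teacher * (log_p_teacher - log_p_student), axis=-1)))

# Toy usage with random features and logits.
rng = np.random.default_rng(0)
old_features = rng.normal(size=(4, 64))
new_features = rng.normal(size=(4, 64))
expanded = expand_features(old_features, new_features)  # shape (4, 128)

teacher_logits = rng.normal(size=(4, 10))
student_logits = rng.normal(size=(4, 10))
loss_same = distillation_loss(teacher_logits, teacher_logits)
loss_diff = distillation_loss(student_logits, teacher_logits)
```

In such a setup, the inputs fed to both teacher and student during compression would be the Rehearsal-CutMix images, so that past-class content appears in every distillation batch.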