
CCSI: Continual Class-Specific Impression for Data-free Class Incremental Learning

Sana Ayromlou, Teresa Tsang, Purang Abolmaesumi, Xiaoxiao Li

TL;DR

This paper tackles data-free class-incremental learning in medical imaging by introducing Continual Class-Specific Impression (CCSI), which synthesizes data for previously learned classes by inverting a frozen classifier, starting from each class's mean image and using Continual Normalization (CN) statistics as a regularizer. A two-step pipeline first generates the synthetic class impressions and then updates the model on new-class data with three novel losses (an intra-domain contrastive loss, a margin loss, and a cosine-normalized cross-entropy loss), together with a distillation term to mitigate forgetting. The approach improves substantially over data-free baselines on MedMNIST benchmarks and a real Heart Echo dataset, with gains of up to 51% in final-task accuracy, and is competitive with memory-based methods. The work highlights CN's role in stabilizing data synthesis and continual training, offering a privacy-friendly route to lifelong learning in clinical settings where storing prior patient data is restricted.

Abstract

In real-world clinical settings, traditional deep learning-based classification methods struggle with diagnosing newly introduced disease types because they require samples from all disease classes for offline training. Class incremental learning offers a promising solution by adapting a deep network trained on specific disease classes to handle new diseases. However, catastrophic forgetting occurs, decreasing the performance on earlier classes when the model is adapted to new data. Previously proposed methods to overcome this require perpetual storage of previous samples, raising practical concerns regarding privacy and storage regulations in healthcare. To this end, we propose a novel data-free class incremental learning framework that synthesizes data for learned classes instead of storing data from previous classes. Our key contributions are acquiring synthetic data, known as Continual Class-Specific Impression (CCSI), for previously trained classes whose data is no longer accessible, and presenting a methodology to effectively use this data for updating networks when new classes are introduced. We obtain CCSI by data inversion over the gradients of the classification model trained on previous classes. The optimization starts from the mean image of each class, a choice inspired by the common landmarks shared among medical images, and uses the statistics of continual normalization layers as a regularizer in this pixel-wise optimization process. Subsequently, we update the network by combining the synthesized data with new class data and incorporating several losses: an intra-domain contrastive loss to generalize the deep network trained on the synthesized data to real data, a margin loss to increase separation between previous classes and new ones, and a cosine-normalized cross-entropy loss to alleviate the adverse effects of imbalanced distributions in the training data.
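The synthesis step described in the abstract can be illustrated with a deliberately small sketch. It is not the paper's implementation: a frozen linear softmax classifier stands in for the trained deep network, and the mean and variance of the pixel values stand in for the statistics stored in the continual normalization layers. All names (`synthesize`, `stats`, `lam`) and all numeric values are assumptions made for illustration. The point it shows is that only the pixels are optimized, starting from the class mean image, while the classifier weights stay fixed.

```python
import math

def softmax(logits):
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def class_probs(x, W, b):
    """Frozen linear classifier (a stand-in for the trained network)."""
    logits = [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]
    return softmax(logits)

def mean_var(x):
    m = sum(x) / len(x)
    return m, sum((xi - m) ** 2 for xi in x) / len(x)

def synthesis_loss(x, target, W, b, stats, lam):
    """Cross-entropy toward `target` plus a statistics-matching regularizer."""
    m, v = mean_var(x)
    reg = (m - stats[0]) ** 2 + (v - stats[1]) ** 2
    return -math.log(class_probs(x, W, b)[target]) + lam * reg

def synthesize(class_mean, target, W, b, stats, lam=10.0, lr=0.1, steps=200):
    """Gradient descent on the pixels only; W and b are never updated."""
    x = list(class_mean)  # mean-image initialization
    d = len(x)
    for _ in range(steps):
        p = class_probs(x, W, b)
        # Gradient of cross-entropy w.r.t. the logits: p - one_hot(target).
        err = [pc - (1.0 if c == target else 0.0) for c, pc in enumerate(p)]
        m, v = mean_var(x)
        grad = []
        for j in range(d):
            g_ce = sum(err[c] * W[c][j] for c in range(len(W)))
            # Gradient of (m - mu)^2 + (v - var)^2 w.r.t. pixel j.
            g_reg = 2 * (m - stats[0]) / d + 4 * (v - stats[1]) * (x[j] - m) / d
            grad.append(g_ce + lam * g_reg)
        x = [xj - lr * gj for xj, gj in zip(x, grad)]
    return x

# Hypothetical toy values: 2 classes, 4 "pixels".
W = [[1.0, -1.0, 0.5, 0.0], [-1.0, 1.0, 0.0, 0.5]]
b = [0.0, 0.0]
class_mean = [0.6, 0.4, 0.5, 0.5]   # assumed mean image of class 0
stats = (0.5, 0.05)                 # assumed stored (mean, variance)
impression = synthesize(class_mean, 0, W, b, stats)
```

In this toy setting the regularizer keeps the synthesized vector from drifting arbitrarily far just to raise the classifier's confidence, which is the role the abstract assigns to the normalization statistics.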

Paper Structure

This paper contains 32 sections, 14 equations, 6 figures, and 5 tables.

Figures (6)

  • Figure 1: Representation of data-free class incremental learning. $f^*_{i-1}$ is the model trained on previous data, while $f_{i}$ is the updated model with new classes. This approach enables the incremental learning of new classes added to a previously trained model without having access to previous data. We propose to tackle this problem by synthesizing samples of previous classes as the continual class-specific impression and adding them to the continual training paradigm. Best viewed in coloured print.
  • Figure 2: The class incremental learning pipeline of CCSI. CCSI consists of two main steps: 1) Continual class-specific data synthesis (Sec. \ref{part1}): Initialize a batch of images with the mean of each class and synthesize images using a frozen model trained on the previous task, $f_{i-1}^{*}$. Update the batch by back-propagating with Eq. \ref{eq:CIloss}, using the statistics saved in the CN layers as a regularization term (Eq. \ref{eq:r_cn}); 2) Model update on new tasks (Sec. \ref{part2}): Leverage information from the previous model using the distillation loss. To prevent catastrophic forgetting of past tasks, we mitigate the domain shift between synthesized and original data with a novel intra-domain contrastive (IdC) loss (Sec. \ref{IdC}), a semi-supervised domain adaptation technique, and we encourage robust decision boundaries and overcome data imbalance with the margin loss (Sec. \ref{Margin}) and the cosine-normalized cross-entropy (CN-CE) loss (Sec. \ref{CN-CE}). Best viewed in coloured print.
  • Figure 3: The effect of each loss on the model's latent space. $f^*_{i-1}$ is the model trained on previous data, and $f_{i}$ is the updated model with new classes. (a) The intra-domain contrastive loss reduces the domain shift by minimizing the distance between the synthesized and test data of the same class and maximizing the distance between synthesized data from different classes. (b) The margin loss enforces the separation between the latent representation of the new class and the previously trained classes. (c) The cosine-normalized cross-entropy loss balances the importance of the new class against the previously trained classes in the latent space to achieve clear class boundaries. Best viewed in coloured print.
  • Figure 4: Dataset samples. For each dataset, the first row shows samples from two different classes, the second row shows the mean initialization of the respective class, and the third row shows the synthesized images. Best viewed in coloured print.
  • Figure 5: Testing accuracies on all tasks compared with state-of-the-art class-incremental learning methods. Dashed lines represent non-data-free methods, while solid lines represent data-free methods. We outperform all data-free methods on all datasets except TissueMNIST. We surpass some non-data-free methods and achieve comparable results with others.
  • ...and 1 more figures
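
The captions of Figures 2 and 3 describe the cosine-normalized cross-entropy loss as a way to balance new classes against previously trained ones. The sketch below illustrates the general idea of cosine normalization only, and it is an assumption about the form of the loss rather than the paper's exact equation: logits are computed as a scaled cosine similarity between the feature vector and each class weight vector, so a class whose weights have a large norm does not dominate. The function names, the `scale` parameter, and the toy numbers are hypothetical.

```python
import math

def l2_normalize(v):
    norm = math.sqrt(sum(a * a for a in v)) or 1.0  # guard against zero vectors
    return [a / norm for a in v]

def dot_logits(feature, class_weights):
    """Standard linear-layer logits: sensitive to the norm of each weight vector."""
    return [sum(a * b for a, b in zip(feature, w)) for w in class_weights]

def cosine_logits(feature, class_weights, scale=10.0):
    """Scaled cosine similarity: weight and feature norms are divided out."""
    f = l2_normalize(feature)
    return [scale * sum(a * b for a, b in zip(f, l2_normalize(w)))
            for w in class_weights]

def cross_entropy(logits, target):
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[target]

def argmax(values):
    return max(range(len(values)), key=lambda i: values[i])

# Hypothetical example: a feature that points toward the old class (index 0),
# while the new class (index 1) has a much larger weight norm.
feature = [1.0, 0.2]
class_weights = [[1.0, 0.0],   # old class, small norm
                 [3.0, 3.0]]   # new class, large norm

plain = dot_logits(feature, class_weights)
cosine = cosine_logits(feature, class_weights)
```

With these toy values the plain dot-product logits favour the large-norm new class, while the cosine-normalized logits favour the old class that the feature is actually aligned with, and the cross-entropy for the old class is correspondingly lower.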