From Uncertainty to Clarity: Uncertainty-Guided Class-Incremental Learning for Limited Biomedical Samples via Semantic Expansion

Yifei Yao, Hanrong Zhang

TL;DR

This work proposes a novel cumulative entropy prediction module that measures sample uncertainty and stores the most uncertain samples in a memory bank as exemplars for the model's later review. It also develops a fine-grained semantic expansion module based on various augmentations, which yields more compact distributions in the feature space and leaves sufficient room for generalization to new classes.

Abstract

In real-world clinical settings, data distributions evolve over time, with a continuous influx of new disease cases that come with only limited samples. Class-incremental learning is therefore of great significance: deep learning models must learn new classes while maintaining accurate recognition of previously seen diseases. However, traditional deep neural networks often suffer from severe forgetting of prior knowledge when adapting to new data unless they are retrained from scratch, which is costly in both time and computation. Additionally, the sample sizes for different diseases can be highly imbalanced, with newly emerging diseases typically having far fewer instances, which causes classification bias. To tackle these challenges, we are the first to propose a class-incremental learning method under limited samples in the biomedical field. First, we propose a novel cumulative entropy prediction module to measure the uncertainty of samples; the most uncertain samples are stored in a memory bank as exemplars for the model's later review. We also theoretically demonstrate its effectiveness in measuring uncertainty. Second, we develop a fine-grained semantic expansion module based on various augmentations, which leads to more compact distributions in the feature space and leaves sufficient room for generalization to new classes. In addition, a cosine classifier is used to mitigate the classification bias caused by imbalanced datasets. Across four imbalanced data distributions over two datasets, our method achieves the best performance, surpassing state-of-the-art methods by as much as 53.54% in accuracy.
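
The bias-mitigation role of a cosine classifier can be illustrated with a minimal sketch. It is not the paper's implementation: the function names, the `scale` value, and the toy weights are assumptions chosen only for illustration. The idea shown is that a plain linear classifier favors classes whose weight vectors have large norms (typical of classes with many samples), whereas a cosine classifier compares directions only.

```python
import math

def l2_normalize(v, eps=1e-12):
    """Scale a vector to unit length (eps avoids division by zero)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / (norm + eps) for x in v]

def linear_logits(feature, class_weights):
    """Plain dot-product logits: sensitive to the norm of each class weight."""
    return [sum(a * b for a, b in zip(feature, w)) for w in class_weights]

def cosine_logits(feature, class_weights, scale=16.0):
    """Scaled cosine similarity: depends only on the angle between vectors.
    The scale factor is an assumed hyperparameter, not taken from the paper."""
    f = l2_normalize(feature)
    return [scale * sum(a * b for a, b in zip(f, l2_normalize(w)))
            for w in class_weights]

# Toy example (made-up numbers).
feature = [1.0, 2.0]
weights = [
    [10.0, 0.0],  # "old" class: large-norm weight, poorly aligned with the feature
    [0.1, 0.2],   # "new" class: small-norm weight, perfectly aligned with the feature
]

lin = linear_logits(feature, weights)
cos = cosine_logits(feature, weights)
print("linear prediction:", lin.index(max(lin)))
print("cosine prediction:", cos.index(max(cos)))
```

In this toy case the linear classifier picks the large-norm old class, while the cosine classifier picks the class whose weight direction matches the feature.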

Paper Structure (31 sections, 32 equations, 13 figures, 3 tables)

Figures (13)

  • Figure 1: A class-incremental learning process, where a classification model initially trained on a large set of images from three tissue types—ADI (adipose tissue), BACK (background), and DEB (debris)—is gradually updated by incorporating a small number of images from other types, such as LYM (lymphocytes), with corresponding updates to the model's embedding space.
  • Figure 2: Accuracy in the base session and the final session when state-of-the-art methods from computer vision (CV) are applied to the biomedical dataset BloodMNIST.
  • Figure 3: Overview of our proposed ESSENTIAL. The target model performs a classification task and a supervised contrastive task, with a backbone shared by the classification and contrastive query networks. The contrastive key network is updated via momentum. A cosine classifier distinguishes between classes and semantic expansions. The supervised contrastive loss and the classification loss jointly update the model. In addition, the memory bank is updated by a separate module, the "Uncertainty Trajectory Analyzer", which selects the most uncertain samples as exemplars.
  • Figure 4: The figure depicts a binary classification task with only 5 epochs. Sample 1 and Sample 2 represent two typical cases from a real training process, showing the probability distribution for each class at every epoch. Unlike the static approach of calculating entropy using only the final epoch’s probability distribution, we use cumulative entropy over the entire training period to assess the uncertainty of the samples.
  • Figure 5: Architecture of the Cumulative Entropy Prediction Module. Multi-layer features are extracted from each intermediate layer of the target model and used as inputs to the prediction model. These features are reduced in dimension and concatenated before being passed through the next fully-connected layer, which outputs the predicted average cumulative entropy. The predicted average cumulative entropies of all samples within a class are then sorted, and the top samples are selected for re-evaluation by the target model to compute their true average cumulative entropy. Finally, the prediction model is updated using the Jensen-Shannon divergence between the predicted and true entropies, along with the cross-entropy loss of the target model.
  • ...and 8 more figures
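
The contrast drawn in the Figure 4 caption, between entropy at the final epoch and entropy accumulated over training, can be sketched as follows. This is a hedged illustration rather than the paper's code: it assumes the average cumulative entropy is the mean Shannon entropy of a sample's predicted class distribution across epochs, the probability trajectories are invented, and `select_exemplars` is a hypothetical helper standing in for the memory-bank selection step.

```python
import math

def entropy(probs):
    """Shannon entropy (in nats) of one predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def average_cumulative_entropy(trajectory):
    """Mean of the per-epoch entropies for one sample (assumed definition)."""
    return sum(entropy(p) for p in trajectory) / len(trajectory)

def select_exemplars(trajectories, k):
    """Return the ids of the k samples with the highest average cumulative entropy."""
    scores = {sid: average_cumulative_entropy(t) for sid, t in trajectories.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]

# Invented 5-epoch trajectories for a binary task; both end in the same prediction.
trajectories = {
    "sample_1": [[0.5, 0.5], [0.7, 0.3], [0.9, 0.1], [0.95, 0.05], [0.99, 0.01]],
    "sample_2": [[0.5, 0.5], [0.4, 0.6], [0.55, 0.45], [0.5, 0.5], [0.99, 0.01]],
}

for sid, traj in trajectories.items():
    print(sid,
          "final-epoch entropy: %.4f" % entropy(traj[-1]),
          "average cumulative entropy: %.4f" % average_cumulative_entropy(traj))

print("selected exemplar:", select_exemplars(trajectories, k=1))
```

Both samples have identical final-epoch entropy, so a static measure cannot separate them. The sample whose predictions stayed uncertain for most of training has the higher cumulative score and is the one selected.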