CURVETE: Curriculum Learning and Progressive Self-supervised Training for Medical Image Classification

Asmaa Abbas, Mohamed Gaber, Mohammed M. Abdelsamea

TL;DR

CURVETE tackles limited annotated data and irregular class distributions in medical image classification by combining self-supervised pretext learning with curriculum learning (CL) and an anti-curriculum (anti-CL) strategy. It uses a convolutional autoencoder to extract latent features from unlabelled data, then builds pseudo-labels at multiple levels of granularity $\{g_k, g_{k-1}, \dots, g_1\}$ by applying $k$-means clustering to the latent space; these pseudo-labels guide pretext training with backbones such as ResNet-50 or DenseNet-121. Downstream training applies anti-CL with class decomposition (e.g., $k=5$) and repeats training for robust fine-tuning, starting from the most granular sub-classes and gradually returning to the original labels. Experimental results on the brain tumour, digital knee x-ray, and Mini-DDSM datasets show CURVETE achieving state-of-the-art performance with both backbones, with statistically significant improvements over traditional transfer learning, CLOG-CD, and other SSL methods, demonstrating its practical value for medical imaging with limited labelled data.
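The multi-granularity pseudo-labelling step can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the latent codes are synthetic stand-ins for autoencoder outputs, the helper names `kmeans_labels` and `granularity_pseudo_labels` are hypothetical, and it assumes that level $g_j$ simply clusters the whole unlabelled pool into $j$ groups (so $g_1$ is a single trivial group).

```python
import numpy as np


def kmeans_labels(features, k, n_iter=50, seed=0):
    """Plain Lloyd's k-means; returns one integer cluster id per row."""
    rng = np.random.default_rng(seed)
    # Initialise centroids from k distinct samples.
    centroids = features[rng.choice(len(features), size=k, replace=False)]
    labels = np.zeros(len(features), dtype=int)
    for _ in range(n_iter):
        # Squared Euclidean distance from every sample to every centroid.
        dists = ((features[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        new_centroids = centroids.copy()
        for j in range(k):
            members = features[labels == j]
            if len(members):  # keep the old centroid if a cluster empties
                new_centroids[j] = members.mean(axis=0)
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return labels


def granularity_pseudo_labels(features, k_max):
    """Pseudo-labels for levels g_{k_max}, ..., g_1 (finest level first)."""
    return {k: kmeans_labels(features, k) for k in range(k_max, 0, -1)}


# Assumption: synthetic 8-dimensional blobs stand in for autoencoder latent codes.
rng = np.random.default_rng(42)
latent = np.concatenate(
    [rng.normal(loc=c, scale=0.3, size=(40, 8)) for c in (0.0, 3.0, 6.0)]
)

pseudo = granularity_pseudo_labels(latent, k_max=5)
for k, labels in pseudo.items():
    print(f"g_{k}: {len(np.unique(labels))} pseudo-classes")
```

In a real pipeline, `latent` would be replaced by the encoder output for each unlabelled image, and each level's labels would supervise one stage of pretext training on the chosen backbone.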

Abstract

Identifying high-quality and easily accessible annotated samples poses a notable challenge in medical image analysis. Transfer learning techniques, leveraging pre-training data, offer a flexible solution to this issue. However, the impact of fine-tuning diminishes when the dataset exhibits an irregular distribution between classes. This paper introduces a novel deep convolutional neural network, named Curriculum Learning and Progressive Self-supervised Training (CURVETE). CURVETE addresses challenges related to limited samples, enhances model generalisability, and improves overall classification performance. It achieves this by employing a curriculum learning strategy based on the granularity of sample decomposition during the training of generic unlabelled samples. Moreover, CURVETE addresses the challenge of irregular class distribution by incorporating a class decomposition approach in the downstream task. The proposed method is evaluated on three distinct medical image datasets: brain tumour, digital knee x-ray, and Mini-DDSM. We investigate the classification performance using a generic self-supervised sample decomposition approach with and without the curriculum learning component in training the pretext task. Experimental results demonstrate that the CURVETE model achieves superior performance on test sets, with an accuracy of 96.60% on the brain tumour dataset, 75.60% on the digital knee x-ray dataset, and 93.35% on the Mini-DDSM dataset using the baseline ResNet-50. Furthermore, with the baseline DenseNet-121, it achieves accuracies of 95.77%, 80.36%, and 93.22% on the brain tumour, digital knee x-ray, and Mini-DDSM datasets, respectively, outperforming other training strategies.

Paper Structure

This paper contains 14 sections, 3 equations, 2 figures, 6 tables.

Figures (2)

  • Figure 1: An example illustrating the granularity of the class decomposition process: a) the original dataset with two classes, $A$ and $B$; b) the new datasets generated after applying class decomposition with granularity $k=4$. At $g_{4}$, each class is divided into four sub-classes. Similarly, at the subsequent levels $g_{3}$ and $g_{2}$, each class is divided into three and two sub-classes, respectively; finally, $g_{1}$ corresponds to the original classes without any decomposition.
  • Figure 2: The framework of the CURVETE model, where $g_{c}$ refers to the maximum number of decomposition granularities of the classes.
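The decomposition in Figure 1 and the finest-to-coarsest ordering it implies can be illustrated with a small label-bookkeeping sketch. Everything here is an assumption for illustration: the function names `decompose` and `anti_curriculum` are hypothetical, the toy samples carry a single scalar feature, and sub-classes are formed by rank-binning that feature within each class, which is only a placeholder for clustering each class's latent features.

```python
from collections import defaultdict


def decompose(samples, granularity):
    """Split each class into `granularity` sub-classes.

    `samples` is a list of (feature, class_name) pairs. Rank-binning the
    scalar feature within each class is a stand-in for real clustering.
    """
    by_class = defaultdict(list)
    for idx, (feature, cls) in enumerate(samples):
        by_class[cls].append((feature, idx))

    sub_labels = [None] * len(samples)
    for cls, members in by_class.items():
        members.sort()
        for rank, (_, idx) in enumerate(members):
            bin_id = rank * granularity // len(members)  # 0 .. granularity-1
            # At g_1 the original class name is kept unchanged.
            sub_labels[idx] = f"{cls}_{bin_id + 1}" if granularity > 1 else cls
    return sub_labels


def anti_curriculum(samples, k):
    """Yield (level, labels) from the finest level g_k down to g_1."""
    for g in range(k, 0, -1):
        yield g, decompose(samples, g)


# Toy dataset with two classes, A and B, as in Figure 1.
samples = [(0.1 * i, "A") for i in range(8)] + [(5 + 0.1 * i, "B") for i in range(8)]

for g, labels in anti_curriculum(samples, k=4):
    print(f"g_{g}: {sorted(set(labels))}")
```

Each yielded label set would define the classification targets for one training stage, so a model trained in this order sees eight sub-classes first and the two original classes last.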