How does self-supervised pretraining improve robustness against noisy labels across various medical image classification datasets?
Bidur Khanal, Binod Bhattarai, Bishesh Khanal, Cristian Linte
TL;DR
This work studies the robustness of medical image classification to label noise using a two-stage pipeline: self-supervised pretraining followed by supervised training with learning-with-noisy-label (LNL) techniques. It evaluates eight self-supervised methods and two LNL strategies across five diverse datasets, ranking dataset difficulty with Fisher's CSS and systematic class grouping. The key finding is that contrastive self-supervised pretraining most consistently improves robustness to noisy labels. DermNet is the most challenging dataset yet shows notable resilience to noise. Gains are larger under symmetric noise and diminish under class-dependent noise. The results offer practical guidance on selecting SSL objectives and show how dataset properties influence robustness, which informs the deployment of SSL-LNL approaches in medical imaging.
Abstract
Noisy labels can significantly impact medical image classification, particularly in deep learning, by corrupting learned features. Self-supervised pretraining, which does not rely on labeled data, can enhance robustness against noisy labels. However, this robustness varies with factors such as the number of classes, dataset complexity, and training set size. In medical images, subtle inter-class differences and modality-specific characteristics add further complexity. Previous research has not comprehensively explored the interplay between self-supervised learning and robustness against noisy labels in medical image classification while considering all of these factors. In this study, we address three key questions: i) How does label noise impact various medical image classification datasets? ii) Which types of medical image datasets are more challenging to learn and more affected by label noise? iii) How do different self-supervised pretraining methods enhance robustness across various medical image datasets? Our results show that, among five datasets (Fetal plane, DermNet, COVID-DU-Ex, MURA, NCT-CRC-HE-100K), DermNet is the most challenging but exhibits greater robustness against noisy labels. Additionally, among the eight self-supervised methods, contrastive learning stands out as the most effective approach for enhancing robustness against noisy labels.
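
To make the notion of label noise concrete, the following is a minimal sketch of symmetric label-noise injection, a common way to simulate noisy labels in experiments of this kind. It is not taken from the paper: the function name `inject_symmetric_noise`, its parameters, and the choice to always flip a selected label to a different class (rather than to any class, including the original) are assumptions made for illustration.

```python
import numpy as np


def inject_symmetric_noise(labels, noise_rate, num_classes, seed=0):
    """Flip roughly `noise_rate` of the labels to a different class, chosen uniformly.

    Hypothetical helper for illustration; not the authors' implementation.
    Returns the noisy labels and a boolean mask marking the flipped samples.
    """
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    noisy = labels.copy()

    # Select each sample for corruption independently with probability noise_rate.
    flip = rng.random(labels.shape[0]) < noise_rate

    # Offsets in 1..num_classes-1 guarantee the new label differs from the old one
    # and is uniform over the remaining classes.
    offsets = rng.integers(1, num_classes, size=int(flip.sum()))
    noisy[flip] = (labels[flip] + offsets) % num_classes
    return noisy, flip


# Example: corrupt about 40% of the labels in a toy 5-class problem.
clean = np.arange(1000) % 5
noisy, flipped = inject_symmetric_noise(clean, noise_rate=0.4, num_classes=5, seed=1)
print(f"fraction of labels changed: {flipped.mean():.2f}")
```

Class-dependent noise would instead draw the replacement label from a per-class transition matrix, so that confusions concentrate on visually similar classes; that variant is omitted here because the transition structure depends on the dataset.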
