QMix: Quality-aware Learning with Mixed Noise for Robust Retinal Disease Diagnosis

Junlin Hou, Jilan Xu, Rui Feng, Hao Chen

TL;DR

This work tackles robust retinal disease diagnosis under mixed noise, where both label noise and data (image-quality) noise degrade performance. It introduces QMix, a two-branch co-training framework that alternates between two phases. Sample separation fits a Gaussian Mixture Model to the joint uncertainty-loss distribution to categorize samples as Correct, Mis-H (mislabeled, high quality), or Mis-L (mislabeled, low quality). Quality-aware semi-supervised learning (SSL) then refines labels, learns robust representations, and applies a contrastive enhancement to further separate Mis-L samples from Correct ones. Within SSL training, the method combines a sample-reweighing loss, a contrastive loss, and a regularization term to suppress the influence of Mis-L samples while preserving useful information from Correct and Mis-H samples. Experiments on six retinal datasets show state-of-the-art results under synthetic and real-world mixed noise, with robustness at high noise ratios and improved sample separation quality, indicating practical value for medical image analysis in noisy acquisition settings.
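The separation step can be pictured as clustering samples in a two-dimensional (uncertainty, loss) space. The sketch below is only an illustration under stated assumptions, not the authors' implementation: it fits a three-component diagonal-covariance Gaussian mixture with a small hand-written EM loop in NumPy. The function names, the farthest-point seeding, the feature standardization, and the synthetic cluster positions are all hypothetical, chosen only so that the example has three separable groups.

```python
import numpy as np

def farthest_point_init(X, k):
    # Deterministic seeding (an assumption of this sketch): start at the point
    # nearest the data mean, then add the point farthest from all chosen seeds.
    centres = [X[np.argmin(((X - X.mean(axis=0)) ** 2).sum(axis=1))]]
    for _ in range(k - 1):
        d2 = np.min([((X - c) ** 2).sum(axis=1) for c in centres], axis=0)
        centres.append(X[np.argmax(d2)])
    return np.array(centres)

def fit_diag_gmm(X, k=3, n_iter=50):
    """EM for a k-component diagonal-covariance GMM; returns posteriors (n, k)."""
    n, d = X.shape
    means = farthest_point_init(X, k)
    variances = np.ones((k, d))
    weights = np.full(k, 1.0 / k)
    for _ in range(n_iter):
        # E-step: log-density of each sample under each component, plus the
        # log mixing weight, normalised per sample into responsibilities.
        diff2 = (X[:, None, :] - means[None, :, :]) ** 2
        log_p = -0.5 * (diff2 / variances + np.log(2 * np.pi * variances)).sum(axis=2)
        log_p += np.log(weights)
        log_p -= log_p.max(axis=1, keepdims=True)
        resp = np.exp(log_p)
        resp /= resp.sum(axis=1, keepdims=True)
        # M-step: re-estimate weights, means, and per-dimension variances.
        nk = resp.sum(axis=0) + 1e-12
        weights = nk / n
        means = (resp.T @ X) / nk[:, None]
        diff2 = (X[:, None, :] - means[None, :, :]) ** 2
        variances = (resp[:, :, None] * diff2).sum(axis=0) / nk[:, None] + 1e-6
    return resp

# Synthetic (uncertainty, loss) pairs for three groups. The positions are
# invented placeholders, not values or trends reported in the paper.
rng = np.random.default_rng(0)
group_centres = np.array([[0.10, 0.2], [0.15, 2.5], [0.80, 1.2]])
group = np.repeat(np.arange(3), 200)
features = group_centres[group] + rng.normal(scale=[0.05, 0.15], size=(600, 2))

# Standardise so that uncertainty and loss contribute on a comparable scale.
Z = (features - features.mean(axis=0)) / features.std(axis=0)
posteriors = fit_diag_gmm(Z, k=3)
component = posteriors.argmax(axis=1)  # hard assignment per sample
```

Each row of `posteriors` holds a sample's membership probabilities over the three components. Deciding which component corresponds to Correct, Mis-H, or Mis-L needs a rule from the paper itself, which this excerpt does not give.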

Abstract

Due to the complexity of medical image acquisition and the difficulty of annotation, medical image datasets inevitably contain noise. Noisy data with wrong labels affects the robustness and generalization ability of deep neural networks. Previous noise learning methods mainly considered noise arising from images being mislabeled, i.e., label noise, assuming that all mislabeled images are of high image quality. However, medical images are prone to extreme quality issues, i.e., data noise, where discriminative visual features are missing for disease diagnosis. In this paper, we propose a noise learning framework, termed QMix, that learns a robust disease diagnosis model under mixed noise. QMix alternates between sample separation and quality-aware semi-supervised training in each training epoch. In the sample separation phase, we design a joint uncertainty-loss criterion to effectively separate (1) correctly labeled images; (2) mislabeled images with high quality; and (3) mislabeled images with low quality. In the semi-supervised training phase, we train a disease diagnosis model to learn robust feature representations from the separated samples. Specifically, we devise a sample-reweighing loss to mitigate the effect of mislabeled images with low quality during training. Meanwhile, a contrastive enhancement loss is proposed to further distinguish mislabeled images with low quality from correctly labeled images. QMix achieved state-of-the-art disease diagnosis performance on five public retinal image datasets and exhibited substantial improvement in robustness against mixed noise.
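To make the idea of a sample-reweighing loss concrete, here is a minimal sketch, assuming each sample's weight is one minus its estimated probability of being a low-quality mislabeled image. That weighting rule, the function name `reweighted_cross_entropy`, and the toy numbers are assumptions made for illustration; the abstract does not state the paper's actual formula.

```python
import numpy as np

def reweighted_cross_entropy(probs, labels, p_mis_l):
    """Weighted mean cross-entropy with per-sample weight 1 - P(Mis-L).

    probs:   (n, classes) predicted class probabilities
    labels:  (n,) integer class labels
    p_mis_l: (n,) assumed probability that each sample is Mis-L
    """
    probs = np.asarray(probs, dtype=float)
    labels = np.asarray(labels, dtype=int)
    weights = 1.0 - np.asarray(p_mis_l, dtype=float)
    # Per-sample cross-entropy: negative log-probability of the given label.
    per_sample = -np.log(probs[np.arange(len(labels)), labels] + 1e-12)
    # Normalise by the total weight so the loss scale stays comparable.
    return float((weights * per_sample).sum() / max(weights.sum(), 1e-12))

probs = np.array([[0.9, 0.1], [0.2, 0.8], [0.5, 0.5]])
labels = np.array([0, 1, 0])

# With all weights equal to 1 this reduces to ordinary mean cross-entropy.
plain = reweighted_cross_entropy(probs, labels, p_mis_l=[0.0, 0.0, 0.0])
# Flagging the third sample as certainly Mis-L removes its contribution.
down = reweighted_cross_entropy(probs, labels, p_mis_l=[0.0, 0.0, 1.0])
```

Under this assumed rule, a sample judged certain to be Mis-L contributes nothing to the gradient, while intermediate probabilities shrink its influence proportionally.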

Paper Structure (30 sections, 12 equations, 8 figures, 5 tables)

Figures (8)

  • Figure 1: An illustration of correctly labeled data (Correct), mislabeled high-quality data (Mis-H), and mislabeled low-quality data (Mis-L).
  • Figure 2: Results of diabetic retinopathy (DR) grading on the DDR dataset. Left: performance degradation (in kappa score) of current methods under mixed noise (green bar); Right: the baseline model, trained with the standard cross-entropy loss, fails to distinguish among Correct/Mis-H/Mis-L.
  • Figure 3: The overall framework of QMix for quality-aware learning with mixed noise. (a) A network co-training scheme that alternates between sample separation and quality-aware semi-supervised learning (SSL) training in each epoch. (b) Sample separation of Correct/Mis-H/Mis-L using the joint distribution of uncertainty and loss. (c) Quality-aware SSL training with an SSL loss ($\mathcal{L}_{SSL}$) and a contrastive enhancement loss ($\mathcal{L}_{con}$).
  • Figure 4: Memorization effect on (a) label noise only and (c) mixed noise; (b) previous small-loss separation; (d) our joint uncertainty-loss criterion.
  • Figure 5: Sample separation performance under symmetric noise on the DDR dataset. Top: Comparison of correct sample AUC. Bottom: Visualization of the joint uncertainty-loss distribution.
  • ...and 3 more figures