A Clinical-oriented Multi-level Contrastive Learning Method for Disease Diagnosis in Low-quality Medical Images
Qingshan Hou, Shuai Cheng, Peng Cao, Jinzhu Yang, Xiaoli Liu, Osmar R. Zaiane, Yih Chung Tham
TL;DR
This work introduces CoMCL, a clinical-oriented multi-level contrastive learning framework for diagnosing diseases from low-quality medical images. It constructs multi-level positive and negative pairs using a pre-trained lesion detector and applies three contrastive losses to separate lesion information from quality artifacts, while a self-paced, adaptive hard-negative mining scheme selects informative samples during training. Evaluations on EyeQ and NIH ChestXray14 show that CoMCL outperforms multiple state-of-the-art baselines and remains robust as image quality deteriorates. The combination of multi-level pairing and dynamic hard-negative mining yields higher-quality lesion embeddings and improved diagnostic performance, which may support more reliable clinical decision-making in resource-constrained settings.
Abstract
Representation learning provides a way to expose distinctive features in the latent space and to interpret deep models. However, the random distribution of lesions and the complexity of low-quality factors in medical images make it difficult for models to extract key lesion features. Disease diagnosis methods guided by contrastive learning (CL) have shown significant advantages in lesion feature representation. Nevertheless, the effectiveness of CL depends heavily on the quality of the positive and negative sample pairs. In this work, we propose a clinical-oriented multi-level CL framework that aims to enhance the model's capacity to extract lesion features and to discriminate between lesions and low-quality factors, thereby enabling more accurate disease diagnosis from low-quality medical images. Specifically, we first construct multi-level positive and negative pairs, integrating information from medical images at different levels and qualities to strengthen the model's recognition of lesion features. Moreover, to improve the quality of the learned lesion embeddings, we introduce a dynamic hard sample mining method based on self-paced learning. The proposed CL framework is validated on two public medical image datasets, EyeQ and Chest X-ray, and demonstrates superior performance compared with other state-of-the-art disease diagnosis methods.
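To make the idea of pairing a contrastive loss with self-paced sample selection concrete, the following is a minimal sketch and not the authors' implementation. It assumes a standard InfoNCE loss over cosine similarities and a simple curriculum in which the easiest negatives are used first and harder ones are admitted as training progresses. The function name `self_paced_info_nce`, the `progress` schedule, and the `temperature` value are all hypothetical choices made for illustration.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale vectors along the last axis to unit length."""
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + eps)


def self_paced_info_nce(anchor, positive, negatives, progress, temperature=0.1):
    """InfoNCE loss for one anchor using a self-paced subset of negatives.

    Assumption (not taken from the paper): `progress` in [0, 1] is the
    fraction of training completed. Negatives are ranked from easy (low
    similarity to the anchor) to hard (high similarity), and only the
    easiest ceil(progress * K) of them are kept, with at least one.
    """
    a = l2_normalize(anchor)
    p = l2_normalize(positive)
    n = l2_normalize(negatives)

    pos_sim = float(a @ p)   # cosine similarity to the positive
    neg_sim = n @ a          # cosine similarities to K negatives, shape (K,)

    # Self-paced selection: admit harder negatives as training advances.
    k = max(1, int(np.ceil(progress * len(neg_sim))))
    kept = np.sort(neg_sim)[:k]  # ascending order, so easiest first

    # Numerically stable log-softmax with the positive at index 0.
    logits = np.concatenate(([pos_sim], kept)) / temperature
    logits -= logits.max()
    log_prob_pos = logits[0] - np.log(np.exp(logits).sum())
    return -log_prob_pos


# Toy usage with random embeddings (illustrative data only).
rng = np.random.default_rng(0)
anchor = rng.normal(size=8)
positive = anchor + 0.1 * rng.normal(size=8)
negatives = rng.normal(size=(16, 8))

early_loss = self_paced_info_nce(anchor, positive, negatives, progress=0.1)
late_loss = self_paced_info_nce(anchor, positive, negatives, progress=1.0)
```

Under this sketch the late-training loss is never smaller than the early one for the same embeddings, because every admitted negative adds a positive term to the softmax denominator. The framework described in the abstract additionally builds its pairs at multiple levels, which this single-level example does not attempt to reproduce.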
