Enhancing Contrastive Learning for Retinal Imaging via Adjusted Augmentation Scales

Zijie Cheng, Boxuan Li, André Altmann, Pearse A Keane, Yukun Zhou

TL;DR

Contrastive learning often underperforms on medical images, which the paper attributes to their dense distribution in latent space; the authors investigate this by adjusting augmentation scales. Using DINO pretraining on unlabeled retinal images, they compare strong augmentation with a weak augmentation (Φ_weak) and a weak+medical augmentation (Φ_weak+med) across six datasets, with internal and external evaluations. Weak augmentation improves feature clustering and downstream AUROC/AUPR (e.g., on MESSIDOR-2, AUROC rises from $0.838$ to $0.848$ and AUPR from $0.523$ to $0.597$). Strong augmentations, and especially Φ_weak+med, can degrade performance, while external tests show modest generalization gains. The study suggests that tuning the augmentation scale is a simple, practical lever for improving self-supervised learning in medical imaging, and may guide future work beyond DINO.

Abstract

Contrastive learning, a prominent approach within self-supervised learning, has demonstrated significant effectiveness in developing generalizable models for various applications involving natural images. However, recent research indicates that these successes do not necessarily extend to the medical imaging domain. In this paper, we investigate the reasons for this suboptimal performance and hypothesize that the dense distribution of medical images poses challenges to the pretext tasks in contrastive learning, particularly in constructing positive and negative pairs. We explore model performance under different augmentation strategies and compare the results to those achieved with strong augmentations. Our study includes six publicly available datasets covering multiple clinically relevant tasks. We further assess the model's generalizability through external evaluations. The model pre-trained with weak augmentation outperforms those with strong augmentation, improving AUROC from 0.838 to 0.848 and AUPR from 0.523 to 0.597 on MESSIDOR2, and showing similar enhancements across other datasets. Our findings suggest that optimizing the scale of augmentation is critical for enhancing the efficacy of contrastive learning in medical imaging.

Paper Structure

This paper contains 11 sections, 3 equations, 2 figures, and 4 tables.

Figures (2)

  • Figure 1: Panels (a) and (b) illustrate the distributions of distances between positive pairs and negative pairs in the natural and medical image domains. Panel (c) presents the project pipeline: unlabeled data is used to pre-train contrastive learning models while various augmentation strategies are investigated. Blue and yellow dots indicate augmented images derived from different original images. The goal is to enhance feature clustering and improve the accuracy of retinal disease diagnosis.
  • Figure 2: Features are extracted with the DINO teacher model (encoder), pre-trained separately with strong and weak augmentations. Panel (a) compares the distributions of Euclidean distances between positive pairs and between negative pairs. Panel (b) shows a t-SNE map of the feature clustering, where different colors represent augmented views of different images.
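
The distance comparison described in the Figure 2 caption can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' code: the features here are synthetic Gaussian clusters standing in for DINO teacher embeddings, and the helper names (`make_views`, `pair_distances`) and all parameter values are hypothetical. A positive pair is taken to be two augmented views of the same image, and a negative pair two views of different images.

```python
import math
import random
from itertools import combinations


def make_views(n_images, n_views, dim, spread, noise, seed=0):
    """Synthetic stand-in for encoder features (assumption, not real DINO output).

    Each image gets a random center; each augmented view is that center
    plus small Gaussian noise.
    """
    rng = random.Random(seed)
    feats, labels = [], []
    for img in range(n_images):
        center = [rng.gauss(0.0, spread) for _ in range(dim)]
        for _ in range(n_views):
            feats.append([c + rng.gauss(0.0, noise) for c in center])
            labels.append(img)
    return feats, labels


def pair_distances(feats, labels):
    """Split all pairwise Euclidean distances into positive and negative pairs."""
    pos, neg = [], []
    for i, j in combinations(range(len(feats)), 2):
        d = math.dist(feats[i], feats[j])
        # Same source image -> positive pair; different images -> negative pair.
        (pos if labels[i] == labels[j] else neg).append(d)
    return pos, neg


def mean(xs):
    return sum(xs) / len(xs)


feats, labels = make_views(n_images=5, n_views=4, dim=8, spread=1.0, noise=0.1)
pos, neg = pair_distances(feats, labels)
print(f"positive pairs: {len(pos)}, mean distance {mean(pos):.3f}")
print(f"negative pairs: {len(neg)}, mean distance {mean(neg):.3f}")
```

With real embeddings, the two distance lists could be plotted as histograms to compare how well positive and negative pairs separate under each augmentation strategy; a larger gap between them would correspond to tighter per-image clustering.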