Big Self-Supervised Models Advance Medical Image Classification

Shekoofeh Azizi, Basil Mustafa, Fiona Ryan, Zachary Beaver, Jan Freyberg, Jonathan Deaton, Aaron Loh, Alan Karthikesalingam, Simon Kornblith, Ting Chen, Vivek Natarajan, Mohammad Norouzi

TL;DR

Big self-supervised models can improve medical image classification when labeled data are scarce. The authors show that pretraining on unlabeled data with contrastive learning, including a novel MICLe strategy that uses multiple images per case, followed by supervised fine‑tuning, yields higher accuracy and AUC than ImageNet‑pretrained baselines on dermatology and CheXpert X‑ray tasks. MICLe boosts performance notably in dermatology and provides robust generalization under distribution shifts, while also reducing label requirements. The work highlights the scalability and domain adaptation advantages of self‑supervised pretraining for medical imaging.

Abstract

Self-supervised pretraining followed by supervised fine-tuning has seen success in image recognition, especially when labeled examples are scarce, but has received limited attention in medical image analysis. This paper studies the effectiveness of self-supervised learning as a pretraining strategy for medical image classification. We conduct experiments on two distinct tasks: dermatology skin condition classification from digital camera images and multi-label chest X-ray classification, and demonstrate that self-supervised learning on ImageNet, followed by additional self-supervised learning on unlabeled domain-specific medical images significantly improves the accuracy of medical image classifiers. We introduce a novel Multi-Instance Contrastive Learning (MICLe) method that uses multiple images of the underlying pathology per patient case, when available, to construct more informative positive pairs for self-supervised learning. Combining our contributions, we achieve an improvement of 6.7% in top-1 accuracy and an improvement of 1.1% in mean AUC on dermatology and chest X-ray classification respectively, outperforming strong supervised baselines pretrained on ImageNet. In addition, we show that big self-supervised models are robust to distribution shift and can learn efficiently with a small number of labeled medical images.

Paper Structure

This paper contains 32 sections, 1 equation, 14 figures, 10 tables, 1 algorithm.

Figures (14)

  • Figure 1: Our approach comprises three steps: (1) Self-supervised pretraining on unlabeled ImageNet using SimCLR (Chen et al., 2020). (2) Additional self-supervised pretraining using unlabeled medical images. If multiple images of each medical condition are available, a novel Multi-Instance Contrastive Learning (MICLe) is used to construct more informative positive pairs based on different images. (3) Supervised fine-tuning on labeled medical images. Note that unlike step (1), steps (2) and (3) are task and dataset specific.
  • Figure 2: Comparison of supervised and self-supervised pretraining, followed by supervised fine-tuning using two architectures on dermatology and chest X-ray classification. Self-supervised learning utilizes unlabeled domain-specific medical images and significantly outperforms supervised ImageNet pretraining.
  • Figure 3: An illustration of our self-supervised pretraining for medical image analysis. When a single image of a medical condition is available, we use standard data augmentation to generate two augmented views of the same image. When multiple images are available, we use two distinct images to directly create a positive pair of examples. We call the latter approach Multi-Instance Contrastive Learning (MICLe).
  • Figure 4: Evaluation of models on distribution-shifted datasets (left: $\mathcal{D}_{\text{Derm}}^{\text{Unlabeled}} \to \mathcal{D}_{\text{Derm}}^{\text{External}}$; right: $\mathcal{D}_{\text{CheXpert}}^{\text{Unlabeled}} \to \mathcal{D}_{\text{NIH}}$) shows that self-supervised training using both ImageNet and the target domain significantly improves robustness to distribution shift.
  • Figure 5: Top-1 accuracy for dermatology condition classification for MICLe, SimCLR, and supervised models under different unlabeled pretraining dataset and varied sizes of label fractions.
  • ...and 9 more figures
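
The positive-pair construction described in the Figure 3 caption can be illustrated with a short sketch. This is not the authors' implementation: the function names `make_positive_pair` and `nt_xent_loss`, the `augment` callback, the convention that rows 2k and 2k+1 of the embedding matrix form a pair, and the temperature of 0.1 are all assumptions made for illustration. The loss shown is the normalized temperature-scaled cross-entropy (NT-Xent) loss used by SimCLR, which is assumed here to also apply to MICLe pairs.

```python
import random
from typing import Callable, Sequence, Tuple, TypeVar

import numpy as np

Image = TypeVar("Image")


def make_positive_pair(
    case_images: Sequence[Image],
    augment: Callable[[Image], Image],
    rng: random.Random,
) -> Tuple[Image, Image]:
    """Build one positive pair for a patient case (hypothetical helper).

    - Two or more images: pick two distinct images of the case (MICLe).
      Extra augmentation of these images is optional and omitted here.
    - One image: fall back to two augmented views of it (SimCLR-style).
    """
    if len(case_images) == 0:
        raise ValueError("a case needs at least one image")
    if len(case_images) >= 2:
        first, second = rng.sample(list(case_images), 2)
        return first, second
    only = case_images[0]
    return augment(only), augment(only)


def nt_xent_loss(z: np.ndarray, temperature: float = 0.1) -> float:
    """NT-Xent loss over embeddings z of shape (2N, d).

    Assumed layout: rows 2k and 2k+1 are the two members of pair k.
    """
    z = z / np.linalg.norm(z, axis=1, keepdims=True)  # cosine similarity
    sim = (z @ z.T) / temperature
    np.fill_diagonal(sim, -np.inf)  # a sample is never its own negative
    n = z.shape[0]
    partner = np.arange(n) ^ 1  # index of the other member of each pair
    # Numerically stable log-sum-exp over each row.
    row_max = sim.max(axis=1, keepdims=True)
    log_sum_exp = row_max[:, 0] + np.log(np.exp(sim - row_max).sum(axis=1))
    return float(np.mean(log_sum_exp - sim[np.arange(n), partner]))
```

In a training loop, one would call `make_positive_pair` once per case in a batch, encode both members of every pair, stack the embeddings in the assumed interleaved order, and minimize `nt_xent_loss`. The loss is low when each embedding is closer to its partner than to the other embeddings in the batch.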