Effect of Rotation Angle in Self-Supervised Pre-training is Dataset-Dependent

Amy Saranchuk, Michael Guerzhoy

TL;DR

This work examines how the rotation angle $\theta$ used in rotation-based contrastive self-supervised pre-training (SSP) influences learned features in medical imaging, showing that effects are highly dataset-dependent and exhibit periodic patterns. Using a MoCo v2 backbone (ResNet-50) pre-trained on ImageNet and fine-tuned on BraTS, Lung Mask, and Kvasir-SEG, the authors analyze learned features via SmoothGrad saliency maps and HoG descriptors, comparing them to ground-truth segmentations with the Dice coefficient. The key finding is that the relationship between $\theta$ and feature quality is not monotonic and varies across datasets, with oscillatory Dice scores and dataset-specific peak angles, challenging simple shortcut explanations. The work highlights the need for direct feature visualization and downstream-task evaluation to understand SSP's guarantees in data-limited medical imaging contexts and suggests directions for measuring learned representations more directly.
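The comparison between a saliency map and a ground-truth segmentation can be illustrated with a minimal sketch of the Dice coefficient, $\text{Dice}(A, B) = 2|A \cap B| / (|A| + |B|)$. The binarization threshold of 0.5, the helper names `binarize` and `dice_coefficient`, and the toy arrays are assumptions made for illustration; how the saliency maps were actually thresholded is not stated here. Plain Python lists are used only to keep the sketch self-contained.

```python
def binarize(saliency, threshold):
    """Turn a real-valued saliency map (nested lists) into a binary mask."""
    return [[value >= threshold for value in row] for row in saliency]


def dice_coefficient(pred, target):
    """Dice = 2|A ∩ B| / (|A| + |B|) for two binary masks given as nested lists."""
    flat_pred = [bool(v) for row in pred for v in row]
    flat_target = [bool(v) for row in target for v in row]
    if len(flat_pred) != len(flat_target):
        raise ValueError("masks must have the same number of pixels")
    intersection = sum(p and t for p, t in zip(flat_pred, flat_target))
    total = sum(flat_pred) + sum(flat_target)
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * intersection / total


# Toy 3x3 example (hypothetical values, not data from the paper).
saliency = [[0.9, 0.8, 0.1],
            [0.7, 0.2, 0.0],
            [0.1, 0.0, 0.0]]
mask = [[1, 1, 0],
        [1, 1, 0],
        [0, 0, 0]]

# Threshold 0.5 is an illustrative assumption.
score = dice_coefficient(binarize(saliency, 0.5), mask)
print(f"Dice = {score:.3f}")
```

In this toy case three of the four foreground pixels are marked salient, so the score is $2 \cdot 3 / (3 + 4) = 6/7$. Repeating such a measurement for each rotation angle $\theta$ would give the kind of Dice-versus-angle curve described above.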

Abstract

Self-supervised learning for pre-training (SSP) can help the network learn better low-level features, especially when the size of the training set is small. In contrastive pre-training, the network is pre-trained to distinguish between different versions of the input. For example, the network learns to distinguish pairs (original, rotated) of images where the rotated image was rotated by angle $\theta$ vs. other pairs of images. In this work, we show that, when training using contrastive pre-training in this way, the angle $\theta$ and the dataset interact in interesting ways. We hypothesize, and give some evidence, that, for some datasets, the network can take "shortcuts" for particular rotation angles $\theta$ based on the distribution of the gradient directions in the input, possibly avoiding learning features other than edges, but our experiments do not seem to support that hypothesis. We demonstrate experiments on three radiology datasets. We compute the saliency map indicating which pixels were important in the SSP process, and compare the saliency map to the ground truth foreground/background segmentation. Our visualizations indicate that the effects of rotation angles in SSP are dataset-dependent. We believe the distribution of gradient orientations may play a role in this, but our experiments so far are inconclusive.
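To make "the distribution of gradient orientations" concrete, the following is a rough sketch of a magnitude-weighted orientation histogram, in the spirit of a single-cell HoG descriptor. It is not the authors' pipeline: the central-difference gradient, the choice of 9 bins over unsigned orientations in $[0°, 180°)$, and the function name `orientation_histogram` are all assumptions made for illustration.

```python
import math


def orientation_histogram(image, n_bins=9):
    """Magnitude-weighted histogram of unsigned gradient orientations in [0, 180).

    `image` is a 2D nested list of intensities. Border pixels are skipped
    because central differences need both neighbours.
    """
    height, width = len(image), len(image[0])
    hist = [0.0] * n_bins
    bin_width = 180.0 / n_bins
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = image[y][x + 1] - image[y][x - 1]  # horizontal central difference
            gy = image[y + 1][x] - image[y - 1][x]  # vertical central difference
            magnitude = math.hypot(gx, gy)
            if magnitude == 0.0:
                continue  # flat region: no orientation to record
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            # The extra modulo guards against a rounded angle of exactly 180.
            hist[int(angle // bin_width) % n_bins] += magnitude
    total = sum(hist)
    return [h / total for h in hist] if total > 0 else hist


# A vertical edge: intensity changes only along x, so all gradient
# mass falls in the first bin (orientations near 0 degrees).
vertical_edge = [[0, 0, 1, 1] for _ in range(4)]
hist = orientation_histogram(vertical_edge)
print([round(h, 2) for h in hist])
```

For an image dominated by a few edge directions, such a histogram is sharply peaked, and rotating the image by $\theta$ shifts the peaks by roughly $\theta$ (modulo 180° for unsigned orientations). This is the kind of dataset-level statistic that a network could, in principle, exploit as a shortcut.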

Paper Structure (12 sections, 7 figures, 1 table)

Figures (7)

  • Figure 1: Sample image and segmentation mask from the BraTS dataset
  • Figure 2: Sample image and segmentation mask from the Lung Mask Image dataset
  • Figure 3: Sample image and segmentation mask from the Kvasir-SEG dataset
  • Figure 4: Comparison of original images from Lung Mask Image dataset, segmentation masks, and saliency maps. Top row: Original orientation. Bottom row: Rotated by $95^\circ$.
  • Figure 5: Comparison of original images from Kvasir-SEG dataset, segmentation masks, and saliency maps. Top row: Original orientation. Bottom row: Rotated by $95^\circ$.
  • ...and 2 more figures