Table of Contents
Fetching ...

Intra-video Positive Pairs in Self-Supervised Learning for Ultrasound

Blake VanBerlo, Alexander Wong, Jesse Hoey, Robert Arntfield

TL;DR

Named Intra-Video Positive Pairs (IVPP), the method surpassed previous ultrasound-specific contrastive learning methods' average test accuracy on COVID-19 classification with the POCUS dataset by ≥ 1.3% and guidelines for practitioners were synthesized based on the results.

Abstract

Self-supervised learning (SSL) is one strategy for addressing the paucity of labelled data in medical imaging by learning representations from unlabelled images. Contrastive and non-contrastive SSL methods produce learned representations that are similar for pairs of related images. Such pairs are commonly constructed by randomly distorting the same image twice. The videographic nature of ultrasound offers flexibility for defining the similarity relationship between pairs of images. In this study, we investigated the effect of utilizing proximal, distinct images from the same B-mode ultrasound video as pairs for SSL. Additionally, we introduced a sample weighting scheme that increases the weight of closer image pairs and demonstrated how it can be integrated into SSL objectives. Named Intra-Video Positive Pairs (IVPP), the method surpassed previous ultrasound-specific contrastive learning methods' average test accuracy on COVID-19 classification with the POCUS dataset by $\ge 1.3\%$. Detailed investigations of IVPP's hyperparameters revealed that some combinations of IVPP hyperparameters can lead to improved or worsened performance, depending on the downstream task. Guidelines for practitioners were synthesized based on the results, such as the merit of IVPP with task-specific hyperparameters, and the improved performance of contrastive methods for ultrasound compared to non-contrastive counterparts.

Intra-video Positive Pairs in Self-Supervised Learning for Ultrasound

TL;DR

Named Intra-Video Positive Pairs (IVPP), the method surpassed previous ultrasound-specific contrastive learning methods' average test accuracy on COVID-19 classification with the POCUS dataset by ≥ 1.3% and guidelines for practitioners were synthesized based on the results.

Abstract

Self-supervised learning (SSL) is one strategy for addressing the paucity of labelled data in medical imaging by learning representations from unlabelled images. Contrastive and non-contrastive SSL methods produce learned representations that are similar for pairs of related images. Such pairs are commonly constructed by randomly distorting the same image twice. The videographic nature of ultrasound offers flexibility for defining the similarity relationship between pairs of images. In this study, we investigated the effect of utilizing proximal, distinct images from the same B-mode ultrasound video as pairs for SSL. Additionally, we introduced a sample weighting scheme that increases the weight of closer image pairs and demonstrated how it can be integrated into SSL objectives. Named Intra-Video Positive Pairs (IVPP), the method surpassed previous ultrasound-specific contrastive learning methods' average test accuracy on COVID-19 classification with the POCUS dataset by . Detailed investigations of IVPP's hyperparameters revealed that some combinations of IVPP hyperparameters can lead to improved or worsened performance, depending on the downstream task. Guidelines for practitioners were synthesized based on the results, such as the merit of IVPP with task-specific hyperparameters, and the improved performance of contrastive methods for ultrasound compared to non-contrastive counterparts.
Paper Structure (19 sections, 6 equations, 5 figures, 6 tables)

This paper contains 19 sections, 6 equations, 5 figures, 6 tables.

Figures (5)

  • Figure 1: An overview of the methods introduced in this study. Positive pairs of images for self-supervised learning (SSL) with ultrasound are randomly sampled from the same B-mode video, such that the distance between them does not exceed a threshold. Each image is then subjected to stochastic data augmentation. As indicated by the darkly shaded fraction of the red bar between pairs, sample weights that are inversely proportional to the distance between eah pair may be integrated into the SSL objective.
  • Figure 2: Illustration of intra-video positive pairs. Positive pairs are considered images that are no more than a threshold apart from each other within the same ultrasound video.
  • Figure 3: Average test accuracy across $5$-fold cross validation on the POCUS dataset. Models were pretrained with a variety of intra-video positive pair thresholds with and without sample weights. Error bars indicate the standard deviation across folds. The dashed line indicates initialization with ImageNet-pretrained weights.
  • Figure 4: ParenchymalLUS test set AUC for the AB and LS binary classification tasks, calculated for models pretrained with a selection of contrastive and non-contrastive learning methods and employing a variety of intra-video positive pair thresholds with and without sample weights (SW). The dashed line indicates initialization with ImageNet-pretrained weights.
  • Figure 5: Boxplots conveying the quartile ranges of test AUC distributions for each pretraining method and assignment to $\delta$, with and without sample weights. Each experiment is repeated $20$ times on disjoint subsets of the training set, each containing all images from a group of patients.