DDA: Dimensionality Driven Augmentation Search for Contrastive Learning in Laparoscopic Surgery

Yuning Zhou; Henry Badgery; Matthew Read; James Bailey; Catherine E. Davey

TL;DR

The paper addresses domain-specific augmentation selection for contrastive self-supervised learning (SSL) in medical imaging, focusing on laparoscopic surgery. It introduces Dimensionality Driven Augmentation Search (DDA), which differentiably optimizes augmentation policies by maximizing the local intrinsic dimensionality (LID) of representations, a proxy objective that does not require downstream labels. Empirically, DDA yields consistent improvements over standard SimCLR and SelfAugment baselines on linear evaluation and downstream segmentation across SVHM and Cholec80, and reveals that color-based augmentations such as hue are not advantageous for laparoscopic imagery. The method is computationally efficient, navigating large policy spaces in hours and providing domain-relevant insights into effective augmentations for medical SSL.

Abstract

Self-supervised learning (SSL) has potential for effective representation learning in medical imaging, but the choice of data augmentation is critical and domain-specific. It remains uncertain if general augmentation policies suit surgical applications. In this work, we automate the search for suitable augmentation policies through a new method called Dimensionality Driven Augmentation Search (DDA). DDA leverages the local dimensionality of deep representations as a proxy target, and differentiably searches for suitable data augmentation policies in contrastive learning. We demonstrate the effectiveness and efficiency of DDA in navigating a large search space and successfully identifying an appropriate data augmentation policy for laparoscopic surgery. We systematically evaluate DDA across three laparoscopic image classification and segmentation tasks, where it significantly improves over existing baselines. Furthermore, DDA's optimised set of augmentations provides insight into domain-specific dependencies when applying contrastive learning in medical applications. For example, while hue is an effective augmentation for natural images, it is not advantageous for laparoscopic images.
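To make the "local dimensionality" proxy concrete, the sketch below estimates LID from nearest-neighbour distances using the widely used maximum-likelihood estimator. It is an illustration only: the function name `lid_mle`, the neighbourhood size `k`, and the choice of this particular estimator are assumptions, since this summary does not state which estimator or settings DDA uses.

```python
import numpy as np


def lid_mle(query, reference, k=20):
    """Maximum-likelihood LID estimate at `query` (hypothetical helper).

    Uses the distances r_1 <= ... <= r_k from `query` to its k nearest
    neighbours in `reference`:
        LID ~= -1 / mean(log(r_i / r_k))
    """
    dists = np.linalg.norm(reference - query, axis=1)
    # Drop zero distances so the query itself (or exact duplicates) is ignored.
    dists = np.sort(dists[dists > 0])[:k]
    r_max = dists[-1]
    return -1.0 / np.mean(np.log(dists / r_max))


# Toy check: a 2D plane embedded in a 10D space should have LID close to 2,
# even though the ambient dimension is 10.
rng = np.random.default_rng(0)
plane = rng.uniform(-1.0, 1.0, size=(2000, 2))
embedded = np.hstack([plane, np.zeros((2000, 8))])
mean_lid = np.mean([lid_mle(q, embedded, k=50) for q in embedded[:50]])
print(f"mean LID estimate: {mean_lid:.2f}")
```

In a search setting, an estimate of this kind would be computed on batches of learned representations rather than on raw coordinates, and a policy that raises it would be preferred; how DDA backpropagates through the estimate is not covered by this sketch.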

Paper Structure (18 sections, 1 theorem, 8 equations, 16 figures, 12 tables)

Introduction
Method
Problem Definition
DDA Search Framework
DDA Search with Representation Dimensionality
Experiments
Evaluations
Analysis of the Augmentation Policy Found by DDA
Conclusion
DDA Algorithm
Experiments
Experiment Settings
Search Space of the Augmentation
Augmentation Policy found by DDA
Additional Results
...and 3 more sections

Key Result

Theorem 1

If $F$ is continuously differentiable at $r$, then

Figures (16)

Figure 1: Illustration of differentiable augmentation policy design and an example application of the contrast operation (left), and the comparison of grid search and our DDA framework for contrastive learning (right).
Figure 2: (a-b) Linear probing accuracy on the Cholec80 Tool dataset with different numbers of augmentation operations ($N$). Each data point is an individual run of the experiment, from augmentation search to pretraining and evaluations. (c-d) Distributions of different augmentation operations found by our method. In subfigures (a) and (c), results are obtained by pretraining on our private SVHM dataset. In subfigures (b) and (d), results are obtained by pretraining on the public dataset Cholec80.
Figure 3: Illustration of DDA and SimCLR augmented images on the SVHM dataset.
Figure 4: (Left) Representations learned by training a SimCLR model with DDA, projected onto a 2D space using t-SNE. In this visualization, the radius $r$ is the maximum distance from the query to the relevant neighbourhood, while $r_1$ and $r_2$ are the distances from the query to its first (NN-1) and second (NN-2) nearest neighbours, respectively. The third nearest neighbour (NN-3) lies on the sphere of radius $r$ centred at the query.
Figure 5: Illustration of 10 augmentation operations (1)-(10) in our search space.
...and 11 more figures

Theorems & Definitions (1)

Theorem 1 (citation key: houle2017local1)
