DeepSet SimCLR: Self-supervised deep sets for improved pathology representation learning

David Torpey, Richard Klein

TL;DR

The paper addresses the high computational cost of self-supervised learning (SSL) for 3D medical data by introducing two efficient variants of a 2D SSL baseline that implicitly account for 3D structure. Per-scan SimCLR adds intra-scan sampling to encourage consistency across slices of the same scan, while DeepSet SimCLR (DS-SimCLR) aggregates a set of slices with a Deep Set encoder, which is permutation invariant and keeps overhead low. Across diverse downstream tasks, both variants often outperform the baseline, with DS-SimCLR delivering particularly strong gains on several datasets, at negligible additional cost. This work broadens the accessibility of SSL in medical imaging by enabling 3D-aware learning with standard 2D architectures plus lightweight extensions.
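The two ideas in this summary, intra-scan sampling and permutation-invariant aggregation of a slice set, can be illustrated with a small sketch. This is not the authors' implementation. The toy linear encoder, the mean-pooling aggregator, the set size of 4, and all function names (`encode_slice`, `deep_set_pool`, `sample_slice_set`, `embed_set`) are assumptions made for illustration only. A real model would use a 2D CNN encoder and a contrastive loss on the pooled embeddings.

```python
import random


def encode_slice(slice_pixels, weights):
    """Toy stand-in for a 2D encoder: a linear map followed by ReLU."""
    return [
        max(0.0, sum(w * x for w, x in zip(row, slice_pixels)))
        for row in weights
    ]


def deep_set_pool(embeddings):
    """Permutation-invariant aggregation: element-wise mean over the set.

    Mean pooling is an assumption here; sum or max pooling would also be
    permutation invariant.
    """
    n = len(embeddings)
    return [sum(column) / n for column in zip(*embeddings)]


def sample_slice_set(scan, k, rng):
    """Intra-scan sampling: draw k distinct slices from a single scan."""
    return rng.sample(scan, k)


def embed_set(slices, weights):
    """Encode every slice independently, then pool into one set embedding."""
    return deep_set_pool([encode_slice(s, weights) for s in slices])


rng = random.Random(0)

# Hypothetical scan: 12 slices, each flattened to 16 "pixels".
scan = [[rng.random() for _ in range(16)] for _ in range(12)]
# Hypothetical encoder weights mapping 16 inputs to an 8-dimensional embedding.
weights = [[rng.uniform(-1.0, 1.0) for _ in range(16)] for _ in range(8)]

# Two slice sets drawn from the SAME scan would act as a positive pair
# in a contrastive objective.
view_a = sample_slice_set(scan, 4, rng)
view_b = sample_slice_set(scan, 4, rng)
z_a = embed_set(view_a, weights)
z_b = embed_set(view_b, weights)

# Reordering the slices within a set should not change its embedding.
shuffled = view_a[:]
rng.shuffle(shuffled)
z_a_shuffled = embed_set(shuffled, weights)
```

Because the pooling step is a mean over slices, `z_a` and `z_a_shuffled` agree up to floating-point rounding, which is the permutation-invariance property the summary refers to. The only cost added on top of the per-slice encoder is the pooling itself.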

Abstract

Often, applications of self-supervised learning (SSL) to 3D medical data opt to use 3D variants of successful 2D network architectures. Although these approaches are promising, they are significantly more computationally demanding to train, which limits their use by those with modest computational resources. In this paper, we therefore aim to improve standard 2D SSL algorithms by modelling the inherent 3D nature of these datasets implicitly. We propose two variants that build upon a strong baseline model and show that both variants often outperform the baseline in a variety of downstream tasks. Importantly, in contrast to previous 2D and 3D approaches for 3D medical data, both of our proposals introduce negligible additional overhead over the baseline, improving the democratisation of these approaches for medical applications.

Paper Structure

This paper contains 23 sections, 1 equation, 2 figures, and 7 tables.

Figures (2)

  • Figure 1: DS-SimCLR architecture.
  • Figure 2: Visualisation of semantic segmentation (SemSeg) fine-tuning results on the Kvasir Instruments dataset for 3 randomly chosen test images. The three columns are, from left to right: input image, ground-truth mask, and predicted mask.