Pixel-level Counterfactual Contrastive Learning for Medical Image Segmentation

Marceau Lafargue-Hauret, Raghav Mehta, Fabio De Sousa Ribeiro, Mélanie Roschewitz, Ben Glocker

Abstract

Image segmentation relies on large annotated datasets, which are expensive and slow to produce. Silver-standard (AI-generated) labels are easier to obtain, but they risk introducing bias. Self-supervised learning, which needs only images, has become key for pre-training. Recent work combining contrastive learning with counterfactual generation improves representation learning for classification but does not readily extend to pixel-level tasks. We propose a pipeline that combines counterfactual generation with dense contrastive learning via Dual-View (DVD-CL) and Multi-View (MVD-CL) methods, along with supervised variants (S-DVD-CL and S-MVD-CL) that use available silver-standard annotations. We also introduce a new visualisation algorithm, the Color-coded High Resolution Overlay map (CHRO-map). Experiments show that annotation-free DVD-CL outperforms other dense contrastive learning methods, while the supervised variants using silver-standard labels outperform training on the silver-standard labeled data directly, achieving $\sim$94% DSC on challenging data. These results indicate that pixel-level contrastive learning, enhanced by counterfactuals and silver-standard annotations, improves robustness to acquisition and pathological variations.
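
For readers unfamiliar with the metric quoted above, DSC (the Dice similarity coefficient) measures the overlap between a predicted mask A and a reference mask B as 2|A∩B| / (|A| + |B|). The snippet below is a minimal sketch of that standard definition for binary masks; the function name `dice_coefficient` and the empty-mask convention are illustrative choices and are not taken from the paper.

```python
def dice_coefficient(pred, ref):
    """Dice similarity coefficient between two binary masks (nested lists of 0/1)."""
    # Flatten both masks so they can be compared pixel by pixel.
    p = [v for row in pred for v in row]
    r = [v for row in ref for v in row]
    if len(p) != len(r):
        raise ValueError("masks must contain the same number of pixels")
    intersection = sum(1 for a, b in zip(p, r) if a and b)
    total = sum(1 for a in p if a) + sum(1 for b in r if b)
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * intersection / total


# Toy example: 3 foreground pixels in each mask, 2 of them shared.
pred = [[0, 1, 1],
        [0, 1, 0]]
ref = [[0, 1, 0],
       [0, 1, 1]]
print(dice_coefficient(pred, ref))  # 2 * 2 / (3 + 3)
```

A score of 1.0 means the two masks coincide exactly, and 0.0 means they share no foreground pixel.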

Paper Structure (12 sections, 3 figures, 2 tables)

Figures (3)

  • Figure 1: Overview of the proposed methods. Views are formed through scanner (SC) and pleural effusion (PE) counterfactuals, in combination with a traditional augmentation pipeline. (S-)DVD-CL performs three dense similarity computations, one between the anchor view and each of the target views, and averages the results. (S-)MVD-CL computes the similarity between all views at once.
  • Figure 2: Output CHRO-maps of the four different methods. Similar colors indicate similar encodings. We observe that MVD-CL fails to capture meaningful representations, whereas DVD-CL effectively encodes pixels based on their relative position to the spine, which can be further fine-tuned for segmentation. S-DVD-CL and S-MVD-CL manage to sharply distinguish both lungs.
  • Figure 3: Qualitative results for lung segmentation on a PE image using various pre-training methods. Supervised methods (top row) perform better than unsupervised methods (bottom row). We also observe that the proposed methods outperform their baseline counterparts.
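
The dual-view scheme described in the Figure 1 caption (one dense similarity computation per anchor-target pair, then an average) can be sketched roughly as follows. This is only an illustration under stated assumptions: the exact loss, the choice of positives and negatives, and the temperature are not given on this page. The sketch assumes an InfoNCE-style objective, L2-normalised per-pixel features, and spatially aligned views in which pixel i of the anchor corresponds to pixel i of each target. The names `dense_info_nce` and `dual_view_dense_loss` are hypothetical.

```python
import math


def dense_info_nce(anchor, target, temperature=0.1):
    """InfoNCE-style loss over pixels (assumed form, not taken from the paper).

    `anchor` and `target` are lists of L2-normalised feature vectors, one per
    pixel. Pixel i of `anchor` is treated as the positive match of pixel i of
    `target`; all other target pixels act as negatives.
    """
    loss = 0.0
    for i, a in enumerate(anchor):
        # Scaled dot-product similarity of anchor pixel i to every target pixel.
        logits = [sum(x * y for x, y in zip(a, t)) / temperature for t in target]
        # Numerically stable log-sum-exp for the softmax denominator.
        m = max(logits)
        log_denominator = m + math.log(sum(math.exp(l - m) for l in logits))
        loss += log_denominator - logits[i]
    return loss / len(anchor)


def dual_view_dense_loss(anchor, targets, temperature=0.1):
    """Average the pairwise dense loss over every target view."""
    return sum(dense_info_nce(anchor, t, temperature) for t in targets) / len(targets)


# Toy features: two pixels with two channels each.
anchor = [[1.0, 0.0], [0.0, 1.0]]
aligned = [[1.0, 0.0], [0.0, 1.0]]   # same encoding at matching pixels
swapped = [[0.0, 1.0], [1.0, 0.0]]   # encodings exchanged between pixels
print(dual_view_dense_loss(anchor, [aligned, aligned, swapped]))
```

In this toy setup the loss is near zero when matching pixels share an encoding and large when they do not, which is the behaviour a dense contrastive objective is meant to reward. A multi-view variant would instead compare all views in a single computation and is not sketched here.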