Generalizable Single-Source Cross-modality Medical Image Segmentation via Invariant Causal Mechanisms

Boqi Chen, Yuanzhi Zhu, Yunke Ao, Sebastiano Caprara, Reto Sutter, Gunnar Rätsch, Ender Konukoglu, Anna Susmelj

TL;DR

This work combines causality-inspired theoretical insights on learning domain-invariant representations with recent advancements in diffusion-based augmentation to improve generalization across diverse imaging modalities in cross-modality medical image segmentation.

Abstract

Single-source domain generalization (SDG) aims to learn a model from a single source domain that can generalize well on unseen target domains. This is an important task in computer vision, particularly relevant to medical imaging where domain shifts are common. In this work, we consider a challenging yet practical setting: SDG for cross-modality medical image segmentation. We combine causality-inspired theoretical insights on learning domain-invariant representations with recent advancements in diffusion-based augmentation to improve generalization across diverse imaging modalities. Guided by the "intervention-augmentation equivariant" principle, we use controlled diffusion models (DMs) to simulate diverse imaging styles while preserving the content, leveraging rich generative priors in large-scale pretrained DMs to comprehensively perturb the multidimensional style variable. Extensive experiments on challenging cross-modality segmentation tasks demonstrate that our approach consistently outperforms state-of-the-art SDG methods across three distinct anatomies and imaging modalities. The source code is available at https://github.com/ratschlab/ICMSeg.

Paper Structure

This paper contains 25 sections, 11 equations, 13 figures, 7 tables.

Figures (13)

  • Figure 1: Overview of causal framework. (a) SCM for data generative process and examples of samples from the observational distribution. (b) Causal graph after do-intervention on style variables and examples of corresponding equivariant augmentations generated with the conditional diffusion model.
  • Figure 2: Overview of our method. (a) Fine-tune pretrained SD U-Net ($U^{B}$) on the source domain $D_0$ (left), and then train ControlNet with the fine-tuned SD U-Net ($U^{D_0}$) to inject image conditions (right), both using style-agnostic prompts. (b) Generate style-intervened images using ControlNet and $U^{B}$ with style-intervention prompts. The segmentation model is trained on pairs of original and style-intervened images using a segmentation loss and InfoNCE regularization.
  • Figure 3: Visualization of segmentation results on the task. First two rows: "CT to MRI"; last two rows: "MRI to CT".
  • Figure 4: Visualization of segmentation results on the task. First row: "CT to MRI"; second row: "CT to X-Ray"; third row: "MRI to CT"; and last row: "MRI to X-Ray".
  • Figure 5: Visualization of segmentation results on the task. First two rows: "CT to X-Ray".
  • ...and 8 more figures
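
The Figure 2 caption states that the segmentation model is trained on pairs of original and style-intervened images with a segmentation loss plus InfoNCE regularization. As a rough illustration of what an InfoNCE term over such pairs can look like, here is a minimal sketch in plain Python. It is a generic InfoNCE formulation, not the authors' implementation: the function name `info_nce`, the use of cosine similarity, the temperature value, and the choice of one feature vector per image are all assumptions made for this example.

```python
import math


def _normalize(vec):
    """Scale a vector to unit L2 norm (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def info_nce(orig_feats, aug_feats, temperature=0.1):
    """Mean InfoNCE loss over a batch of paired feature vectors.

    Row i of `orig_feats` (features of an original image) is treated as the
    positive match of row i of `aug_feats` (features of its style-intervened
    counterpart); every other row of `aug_feats` acts as a negative.
    Both arguments and the temperature are hypothetical inputs for this sketch.
    """
    if not orig_feats or len(orig_feats) != len(aug_feats):
        raise ValueError("expected two non-empty batches of equal size")

    anchors = [_normalize(v) for v in orig_feats]
    candidates = [_normalize(v) for v in aug_feats]

    total = 0.0
    for i, anchor in enumerate(anchors):
        # Cosine similarity of this anchor to every candidate, scaled by temperature.
        logits = [
            sum(a * c for a, c in zip(anchor, cand)) / temperature
            for cand in candidates
        ]
        # Numerically stable log-sum-exp for the softmax denominator.
        peak = max(logits)
        log_denominator = peak + math.log(sum(math.exp(l - peak) for l in logits))
        # Cross-entropy with the matching pair (index i) as the target class.
        total += log_denominator - logits[i]
    return total / len(anchors)


if __name__ == "__main__":
    originals = [[1.0, 0.0], [0.0, 1.0]]
    style_intervened = [[0.9, 0.1], [0.1, 0.9]]
    print(info_nce(originals, style_intervened))
```

The loss is small when each original feature is closest to the feature of its own style-intervened version, and it grows when features of different images are confused. This matches the intent of encouraging representations that stay stable under style changes. In a real training setup this term would be added to the segmentation loss with a weighting factor, which this section does not specify.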

Theorems & Definitions (2)

  • Definition 1: arjovsky2019invariant
  • Definition 2: ilse2021selecting