Semantic-aware Random Convolution and Source Matching for Domain Generalization in Medical Image Segmentation

Franz Thaler, Martin Urschler, Mateusz Kozinski, Matthias AF Gsell, Gernot Plank, Darko Stern

TL;DR

SRCSM tackles single-source domain generalization for medical image segmentation by combining semantic-aware random convolution (SRC) during training with test-time source matching (SM) to align target images to the source distribution. SRC uses region-specific, label-conditioned augmentations to capture modality-driven appearance changes, while SM remaps target intensities via histogram-based quantile mapping. Across abdominal, whole-heart, and prostate data, SRCSM achieves state-of-the-art cross-modality and cross-center generalization, often approaching or matching in-domain performance and demonstrating robustness to large morphologic shifts such as cine MR. The method is validated through extensive experiments, including a multi-domain, multi-label cardiac evaluation and a cine-transfer study, supported by rigorous statistical analysis. Overall, SRCSM offers a practical, training-efficient pathway to high-quality segmentation in unseen domains without target-domain data or test-time optimization.

Abstract

We tackle the challenging problem of single-source domain generalization (DG) for medical image segmentation. To this end, we aim to train a network on one domain (e.g., CT) and directly apply it to a different domain (e.g., MR) without adapting the model and without requiring images or annotations from the new domain during training. We propose a novel method for promoting DG when training deep segmentation networks, which we call SRCSM. During training, our method diversifies the source domain through semantic-aware random convolution, where different regions of a source image are augmented differently, based on their annotation labels. At test time, we complement the randomization of the training domain by mapping the intensities of target domain images, making them similar to source domain data. We perform a comprehensive evaluation on a variety of cross-modality and cross-center generalization settings for abdominal, whole-heart and prostate segmentation, where we outperform previous DG techniques in the vast majority of experiments. Additionally, we investigate our method when training on whole-heart CT or MR data and testing on the diastolic and systolic phases of cine MR data captured with different scanner hardware, where we make a step towards closing the domain gap in this even more challenging setting. Overall, our evaluation shows that SRCSM can be considered a new state-of-the-art in DG for medical image segmentation and, moreover, achieves a segmentation performance that matches that of the in-domain baseline in several settings.
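
The test-time intensity mapping described above can be illustrated with a generic quantile-mapping sketch. This is not the authors' implementation: the function name `match_to_source`, the parameter `n_quantiles`, and the use of a pooled array of source intensities as reference are all assumptions made for illustration.

```python
import numpy as np

def match_to_source(target, source_ref, n_quantiles=256):
    """Remap target intensities so their distribution resembles the source.

    Hypothetical helper: `source_ref` is any array of source-domain
    intensities (for example, pooled training voxels).
    """
    q = np.linspace(0.0, 1.0, n_quantiles)
    t_q = np.quantile(target, q)      # target intensity at each quantile
    s_q = np.quantile(source_ref, q)  # source intensity at the same quantile
    # np.interp expects increasing x-coordinates, so drop repeated
    # target quantiles (these occur with flat backgrounds).
    t_q, idx = np.unique(t_q, return_index=True)
    s_q = s_q[idx]
    # Piecewise-linear lookup: target value -> quantile -> source value.
    return np.interp(target, t_q, s_q)

rng = np.random.default_rng(0)
target = rng.normal(5.0, 2.0, size=(32, 32))       # stand-in "target" image
source = rng.normal(100.0, 10.0, size=(64, 64))    # stand-in "source" intensities
matched = match_to_source(target, source)
```

Because the mapping is monotone, the intensity ordering of the target image is preserved while its value range and distribution move towards the source. No model weights are updated, which is in line with the claim that no test-time optimization is needed.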

Paper Structure

This paper contains 20 sections, 5 equations, 7 figures, 10 tables.

Figures (7)

  • Figure 1: We propose SRCSM, a single-source cross-modality domain generalization approach that (1) during training, expands the source data distribution ($\mathcal{S}$) through our Semantic-aware Random Convolution (SRC), and (2) at test-time, shifts images from unseen target domains ($\mathcal{T}_1$ and $\mathcal{T}_2$) towards the source domain ($\mathcal{S}$) via our Source Matching (SM) strategy.
  • Figure 2: Contrast between anatomical structures strongly depends on the imaging modality. Left: a cardiac MR and its ground truth annotation; Right: the same anatomy visible in a cardiac CT. Note the differences between MR and CT in relative intensity of the left ventricle (red), the right ventricle (green), and the myocardium (cyan).
  • Figure 3: Our domain generalization approach: Images from source domain $\mathcal{S}$ are first augmented using conventional data augmentation (CDA). Then, our Semantic-aware Random Convolution (SRC) strategy applies a distinct nonlinear intensity augmentation for each available semantic label $c\in C$. After that, each augmented image is masked using its corresponding smoothed map $\mathbf{m}_{c}$, and the masked images are recombined to yield the final SRC-augmented image.
  • Figure 4: Exemplary abdominal (row 1) and cardiac (row 2) images are shown before (cols: 1, 3, 5, 7) and after (cols: 2, 4, 6, 8) applying the proposed Semantic-aware Random Convolution (SRC). Image contrast was adjusted for better visualization.
  • Figure 5: Exemplary abdominal (row 1) and cardiac (row 2) images are shown before (cols: 1, 3, 5, 7) and after (cols: 2, 4, 6, 8) applying the proposed Source Matching (SM). Image contrast was adjusted for better visualization.
  • ...and 2 more figures
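
The pipeline in the Figure 3 caption (one augmentation per label, masking with a smoothed map $\mathbf{m}_{c}$, recombination) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the paper's method: the 3x3 Gaussian-random kernel, the `tanh` nonlinearity, and the box-blur mask smoothing are placeholders, and `conv2d_same` and `src_augment` are hypothetical names.

```python
import numpy as np

def conv2d_same(img, kernel):
    """Same-size 2D filtering with edge padding (odd kernel sizes only)."""
    kh, kw = kernel.shape
    padded = np.pad(img, ((kh // 2, kh // 2), (kw // 2, kw // 2)), mode="edge")
    out = np.zeros(img.shape, dtype=float)
    for i in range(kh):
        for j in range(kw):
            out += kernel[i, j] * padded[i:i + img.shape[0], j:j + img.shape[1]]
    return out

def src_augment(image, labels, rng, kernel_size=3, smooth_size=5):
    """Augment each labelled region differently, then blend the results."""
    box = np.full((smooth_size, smooth_size), 1.0 / smooth_size**2)
    out = np.zeros(image.shape, dtype=float)
    for c in np.unique(labels):
        # Assumption: a fresh random kernel per label c, followed by tanh
        # as a stand-in for the nonlinear intensity augmentation.
        kernel = rng.normal(size=(kernel_size, kernel_size))
        augmented = np.tanh(conv2d_same(image, kernel))
        # Assumption: m_c is a box-blurred binary mask of label c.
        m_c = conv2d_same((labels == c).astype(float), box)
        out += m_c * augmented
    # The smoothed masks sum to one at every pixel, so `out` is a
    # per-pixel weighted average of the label-specific augmentations.
    return out

rng = np.random.default_rng(0)
image = rng.random((16, 16))
labels = np.zeros((16, 16), dtype=int)
labels[4:12, 4:12] = 1
labels[6:9, 6:9] = 2
a = src_augment(image, labels, np.random.default_rng(1))
b = src_augment(image, labels, np.random.default_rng(1))
```

Smoothing the masks before blending avoids hard intensity edges at label boundaries, which is presumably why the caption refers to smoothed maps rather than binary ones.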