Cross-modal tumor segmentation using generative blending augmentation and self training

Guillaume Sallé; Pierre-Henri Conze; Julien Bert; Nicolas Boussion; Dimitris Visvikis; Vincent Jaouen

TL;DR

Domain shift and data scarcity limit cross-modal medical image segmentation. The authors propose Generative Blending Augmentation (GBA), which uses a cascade of SinGAN generators to diversify tumor appearance and to harmonize the altered regions of interest within CycleGAN-generated pseudo-target images. GBA is paired with iterative self-training to improve segmentation on an unlabelled target modality. Integrated into a conventional pipeline of image-to-image translation followed by segmentation, the approach ranked first for vestibular schwannoma segmentation in the CrossMoDA 2022 challenge, with the best mean Dice and average symmetric surface distance (ASSD) scores. The study highlights local contrast alteration of tumor appearance and self-training as effective strategies for closing appearance gaps between centers and modalities, with likely applicability to other segmentation tasks under domain shift.

Abstract

\textit{Objectives}: Data scarcity and domain shifts lead to biased training sets that do not accurately represent deployment conditions. A related practical problem is cross-modal image segmentation, where the objective is to segment unlabelled images using previously labelled datasets from other imaging modalities. \textit{Methods}: We propose a cross-modal segmentation method based on conventional image synthesis boosted by a new data augmentation technique called Generative Blending Augmentation (GBA). GBA leverages a SinGAN model to learn representative generative features from a single training image in order to realistically diversify tumor appearances. In this way, we compensate for image synthesis errors and thereby improve the generalization power of a downstream segmentation model. The proposed augmentation is further combined with an iterative self-training procedure that leverages pseudo labels at each pass. \textit{Results}: The proposed solution ranked first for vestibular schwannoma (VS) segmentation during the validation and test phases of the MICCAI CrossMoDA 2022 challenge, with the best mean Dice similarity and average symmetric surface distance measures. \textit{Conclusion and significance}: Local contrast alteration of tumor appearances and iterative self-training with pseudo labels are likely to lead to performance improvements in a variety of segmentation contexts.
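The iterative self-training procedure mentioned in the abstract can be illustrated with a deliberately small toy. The sketch below is not the authors' implementation: the "segmentation model" is reduced to a single intensity threshold, images are flat lists of intensities, and the names `fit_threshold`, `predict` and `self_train` are hypothetical. It only shows the loop structure, in which a teacher trained on labelled pseudo-target images produces pseudo labels for unlabelled target images, and a student retrained on both sets then replaces the teacher.

```python
from statistics import mean


def fit_threshold(images, masks):
    """Toy stand-in for training a segmentation model.

    Returns the midpoint between the mean foreground and mean background
    intensity. Assumes both classes are present in the training masks.
    """
    fg = [v for img, m in zip(images, masks) for v, lab in zip(img, m) if lab == 1]
    bg = [v for img, m in zip(images, masks) for v, lab in zip(img, m) if lab == 0]
    return (mean(fg) + mean(bg)) / 2.0


def predict(threshold, image):
    """Label every intensity above the threshold as tumor (1), else background (0)."""
    return [1 if v > threshold else 0 for v in image]


def self_train(pseudo_target_images, source_masks, target_images, n_passes=3):
    """Teacher/student loop (illustrative only).

    pseudo_target_images: synthetic target-like images that inherit source labels
    source_masks:         ground-truth masks for those synthetic images
    target_images:        real, unlabelled target images
    """
    # First pass: the teacher sees labelled pseudo-target images only.
    model = fit_threshold(pseudo_target_images, source_masks)
    for _ in range(n_passes):
        # The current teacher labels the real target images.
        pseudo_labels = [predict(model, img) for img in target_images]
        # The student is retrained on synthetic and real images together,
        # then replaces the teacher for the next pass.
        model = fit_threshold(
            pseudo_target_images + target_images,
            source_masks + pseudo_labels,
        )
    return model


# Real target images here are slightly brighter than the synthetic ones.
synthetic = [[0.1, 0.2, 0.8, 0.9]]
labels = [[0, 0, 1, 1]]
real_unlabelled = [[0.3, 0.4, 1.0, 1.1]]
final_threshold = self_train(synthetic, labels, real_unlabelled)
print(round(final_threshold, 3))
```

In this toy, the threshold moves toward the brighter real target images once their pseudo labels enter the training set, which is the intended effect of enriching the student with target-domain data.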

Paper Structure (23 sections, 7 equations, 10 figures, 3 tables)

Introduction
Related Works
Cross-modal segmentation
Data augmentation
Method
Notations and general objective
Image-to-image translation
Generative Blending Augmentation
Segmentation with self-training
Validation setup
Challenge dataset, metrics and ranking
Implementation details
GBA and SinGAN parameters
Image resampling
Segmentation parameters
...and 8 more sections

Figures (10)

Figure 1: Example cross-modal tumor segmentation scenario (CrossMoDA 2022 challenge). Source domain images $X^S$ (blue frame, here contrast-enhanced T1 MRI) are provided with segmentation labels $Y^S$, while target domain images $X^T$ (red frame, here high-resolution T2 MRI) are unlabelled. Tumors show variable visual appearance between two imaging centers, with variations in modes of intensity and size (shown in kernel density plots).
Figure 2: SinGAN generative model trained from a single 2D image (Shaham et al., 2019). Each generator/discriminator pair $(G_k, D_k)$ is trained independently and successively, from the coarsest scale ($k=K$) to the finest ($k=0$). Scale level $k$ is obtained by downsampling the original image by a factor $r^k$.
Figure 3: Generative blending augmentation (GBA) of a vestibular schwannoma in a synthetic hrT2 image generated by CycleGAN (CrossMoDA 2022 dataset). For each tumor slice, tumor intensity values are linearly scaled by a factor $\lambda$ to alter contrast. A SinGAN generative model is then used to realistically blend the contrast-altered tumor into the original image. The cascaded SinGAN generators are trained on the central slice and applied to every slice containing the tumor.
Figure 4: Illustration of the instability of CycleGAN outputs in the ceT1$\rightarrow$hrT2 task of CrossMoDA 2022, affecting tumor appearance (close-up view). (a) Original ceT1 image; (b,c,d) synthetic hrT2 images from three CycleGAN runs with identical hyperparameters.
Figure 5: Iterative segmentation and self-training with GBA. Red and blue thumbnails stand for real and generated images, respectively, while yellow ROI segmentation masks represent pseudo-labels. After a first segmentation pass with a teacher model trained on pseudo-target images with GBA, pseudo-labels are generated for the unlabelled target set. A student model is then retrained on a training set enriched with target images and their pseudo-labels. The student replaces the teacher, and this loop is repeated several times to iteratively improve pseudo-label quality.
...and 5 more figures
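The contrast alteration step described in the Figure 3 caption (linear scaling of tumor intensities by a factor $\lambda$) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' code: the function name `alter_tumor_contrast` is hypothetical, the image and mask are plain nested lists to keep the example dependency-free, and the subsequent SinGAN harmonization that blends the altered tumor into its surroundings is not reproduced here.

```python
def alter_tumor_contrast(image, mask, lam):
    """Scale intensities inside the tumor mask by lam; leave other pixels unchanged.

    image: 2D slice as a list of rows of floats
    mask:  binary tumor mask with the same shape (1 = tumor, 0 = background)
    lam:   linear scaling factor (lam < 1 darkens the tumor, lam > 1 brightens it)
    """
    return [
        [pixel * lam if inside else pixel for pixel, inside in zip(img_row, mask_row)]
        for img_row, mask_row in zip(image, mask)
    ]


# A 3x3 slice with a single bright "tumor" pixel in the centre.
slice_2d = [
    [0.2, 0.2, 0.2],
    [0.2, 0.8, 0.2],
    [0.2, 0.2, 0.2],
]
tumor_mask = [
    [0, 0, 0],
    [0, 1, 0],
    [0, 0, 0],
]

# Produce several contrast variants of the same slice.
variants = {lam: alter_tumor_contrast(slice_2d, tumor_mask, lam) for lam in (0.5, 1.0, 1.5)}
for lam, variant in variants.items():
    print(lam, variant[1])
```

Scaling alone leaves a hard seam at the mask boundary, which is why the caption describes a generative blending step applied afterwards.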
