FIESTA: Fourier-Based Semantic Augmentation with Uncertainty Guidance for Enhanced Domain Generalizability in Medical Image Segmentation

Kwanseok Oh, Eunjin Jeon, Da-Woon Heo, Yooseung Shin, Heung-Il Suk

TL;DR

FIESTA targets single-source domain generalization in medical image segmentation (MIS) with a Fourier-based semantic augmentation framework that modulates amplitude and phase in the frequency domain. Central to FIESTA is the Fourier Augmentative Transformer (FAT), which combines amplitude masking and intra-modulation with phase attention, plus a self-Mixup strategy, to generate semantically meaningful diversity. The method further uses epistemic uncertainty to guide augmentation, focusing learning on ambiguous regions and improving cross-domain robustness across three MIS tasks, with scalability demonstrated via integration with SAM/MedSAM. The approach yields state-of-the-art or competitive Dice scores, offers a practical, plug-and-play augmentation solution, and provides code for reproducibility.

Abstract

Single-source domain generalization (SDG) in medical image segmentation (MIS) aims to generalize a model using data from only one source domain to segment data from an unseen target domain. Despite substantial advances in SDG with data augmentation, existing methods often fail to fully consider the details and uncertain areas prevalent in MIS, leading to mis-segmentation. This paper proposes a Fourier-based semantic augmentation method called FIESTA using uncertainty guidance to enhance the fundamental goals of MIS in an SDG context by manipulating the amplitude and phase components in the frequency domain. The proposed Fourier augmentative transformer addresses semantic amplitude modulation based on meaningful angular points to induce pertinent variations and harnesses the phase spectrum to ensure structural coherence. Moreover, FIESTA employs epistemic uncertainty to fine-tune the augmentation process, improving the ability of the model to adapt to diverse augmented data and concentrate on areas with higher ambiguity. Extensive experiments across three cross-domain scenarios demonstrate that FIESTA surpasses recent state-of-the-art SDG approaches in segmentation performance and significantly contributes to boosting the applicability of the model in medical imaging modalities.
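
To make the amplitude/phase manipulation described in the abstract concrete, the following is a minimal NumPy sketch, not the authors' FAT implementation. It splits an image into amplitude and phase spectra, rescales only the low-frequency amplitudes, and rebuilds the image with the original phase. The function names, the square low-frequency window, and the `gain` and `radius` parameters are assumptions made for illustration; FIESTA's actual masking, intra-modulation, and phase attention are more involved.

```python
import numpy as np


def split_amplitude_phase(image):
    """Return the amplitude and phase spectra of a 2-D image."""
    spectrum = np.fft.fft2(image)
    return np.abs(spectrum), np.angle(spectrum)


def merge_amplitude_phase(amplitude, phase):
    """Rebuild a real-valued image from amplitude and phase spectra."""
    spectrum = amplitude * np.exp(1j * phase)
    return np.real(np.fft.ifft2(spectrum))


def scale_low_frequencies(image, gain=1.2, radius=4):
    """Hypothetical augmentation: rescale low-frequency amplitudes, keep the phase.

    The window is centred on the zero-frequency (DC) term after fftshift, so it
    is symmetric in frequency and the reconstructed image stays real.
    """
    amplitude, phase = split_amplitude_phase(image)
    amplitude = np.fft.fftshift(amplitude)  # move the DC term to the centre
    cy, cx = image.shape[0] // 2, image.shape[1] // 2
    amplitude[cy - radius:cy + radius + 1, cx - radius:cx + radius + 1] *= gain
    amplitude = np.fft.ifftshift(amplitude)  # undo the shift before the inverse FFT
    return merge_amplitude_phase(amplitude, phase)


# Toy usage on random data standing in for a normalized image slice.
rng = np.random.default_rng(0)
image = rng.random((64, 64))
augmented = scale_low_frequencies(image, gain=1.2, radius=4)
```

Because the phase spectrum is untouched, edges and object layout are largely preserved while global intensity statistics shift, which is the intuition behind treating amplitude as "style" and phase as "structure".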

Paper Structure (29 sections, 20 equations, 7 figures, 5 tables, 2 algorithms)

Figures (7)

  • Figure 1: Schematic overview of FIESTA. Context-aware augmentation provides diverse changes throughout the global context via the proposed Fourier augmentative transformer (FAT), whereas location-aware augmentation uses the FAT but additionally adjusts the segment-specific location styles using the Bézier curve. Building on these two augmentation phases, uncertainty-guided mutual augmentation further enforces segmentation in ambiguous regions by generating augmented images via uncertainty guidance.
  • Figure 2: Overall Fourier augmentative transformer (FAT) framework, applying novel masking and intra-modulation techniques to the amplitude spectrum with phase attention via an advanced filtering strategy (FFT: fast Fourier transform, iFFT: inverse FFT).
  • Figure 3: Examples of transformed raw images according to the Bézier curve variant.
  • Figure 4: Qualitative analysis of three cross-domain scenarios: abdominal CT and MRI for cross-modality segmentation (Rows 1 and 2), cardiac bSSFP MRI and LGE MRI for cross-sequence segmentation (Rows 3 and 4), and prostate MRI from Site A to F for cross-site segmentation (bottom row). The rightmost images are the training data, and the remaining images represent the segmentation results of each method evaluated with the unseen target data.
  • Figure 5: Comparative results of segmentation performance (%) against prevalent corruption methods (i.e., F-Cutout, CutMix, Mixup, and SA-Mixup) across cross-domain scenarios. These approaches effectively manipulate the amplitude or phase component within the frequency domain.
  • ...and 2 more figures
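
Figures 1 and 3 refer to Bézier-curve transformations of the raw images. As a rough illustration of how such a transformation is commonly built, and not necessarily the variant used in FIESTA, the sketch below maps intensities through a cubic Bézier curve with fixed endpoints (0, 0) and (1, 1). The function names, the choice of control points, and the assumption that intensities are normalized to [0, 1] are all illustrative.

```python
import numpy as np


def cubic_bezier(p0, p1, p2, p3, n=1000):
    """Sample n points (x, y) along a cubic Bezier curve."""
    t = np.linspace(0.0, 1.0, n)[:, None]
    points = ((1 - t) ** 3) * p0 + 3 * ((1 - t) ** 2) * t * p1 \
        + 3 * (1 - t) * (t ** 2) * p2 + (t ** 3) * p3
    return points[:, 0], points[:, 1]


def bezier_intensity_map(image, p1, p2):
    """Hypothetical intensity remapping for an image normalized to [0, 1].

    p1 and p2 are the two inner control points, assumed to lie in [0, 1]^2.
    """
    p0 = np.array([0.0, 0.0])
    p3 = np.array([1.0, 1.0])
    xs, ys = cubic_bezier(p0, np.asarray(p1, dtype=float),
                          np.asarray(p2, dtype=float), p3)
    order = np.argsort(xs)  # np.interp expects increasing x coordinates
    return np.interp(image, xs[order], ys[order])


# Toy usage on random data standing in for a normalized image slice.
rng = np.random.default_rng(0)
image = rng.random((32, 32))
brightened = bezier_intensity_map(image, p1=(0.2, 0.8), p2=(0.4, 0.9))
identity = bezier_intensity_map(image, p1=(1 / 3, 1 / 3), p2=(2 / 3, 2 / 3))
```

Placing the control points on the diagonal reproduces the input, while moving them above or below it brightens or darkens mid-range intensities without changing where structures sit in the image.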