Adaptive Frequency Domain Alignment Network for Medical image segmentation

Zhanwei Li, Liang Li, Jiawan Zhang

TL;DR

This work tackles the challenge of scarce and mismatched annotations in medical image segmentation by introducing AFDAN, a cross-domain framework that aligns features in the frequency domain. It comprises three modules: Adversarial Domain Learning (ADL) to align amplitude spectra, Source-Target Frequency Fusion (STFF) to fuse low-frequency amplitude while preserving phase, and Spatial-Frequency Integration (SFI) to merge frequency and spatial representations with attention. The approach reports state-of-the-art IoU scores of 90.9% for vitiligo segmentation on the VITILIGO2025 dataset and 82.6% for retinal vessel segmentation on DRIVE, in both cases without training on annotations made specifically for the target structures. These results suggest AFDAN’s potential to enable high-fidelity segmentation in clinical settings where annotated data are limited or heterogeneous.
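
The idea of fusing low-frequency amplitude while preserving phase can be illustrated with a minimal NumPy sketch. This is not the authors' STFF module: the function name `fuse_low_freq_amplitude`, the window ratio `beta`, and the blend weight `lam` are assumptions introduced here for illustration, and the fixed square window stands in for whatever adaptive scheme the paper uses.

```python
import numpy as np


def fuse_low_freq_amplitude(x_s, x_t, beta=0.1, lam=1.0):
    """Blend the low-frequency amplitude of x_t into x_s, keeping the phase of x_s.

    x_s, x_t: 2-D float arrays of the same shape (single-channel images).
    beta:     hypothetical ratio controlling the size of the low-frequency window.
    lam:      hypothetical blend weight (0 keeps the source, 1 takes the target).
    """
    if x_s.shape != x_t.shape:
        raise ValueError("source and target images must have the same shape")

    # 2-D FFT, shifted so that the zero-frequency term sits at the centre.
    f_s = np.fft.fftshift(np.fft.fft2(x_s))
    f_t = np.fft.fftshift(np.fft.fft2(x_t))

    # Split the spectra into amplitude and phase.
    amp_s, pha_s = np.abs(f_s), np.angle(f_s)
    amp_t = np.abs(f_t)

    # Square window around the centre, symmetric about the zero frequency.
    h, w = x_s.shape
    b = int(np.floor(min(h, w) * beta))
    ch, cw = h // 2, w // 2
    mask = np.zeros((h, w), dtype=bool)
    mask[ch - b:ch + b + 1, cw - b:cw + b + 1] = True

    # Blend amplitudes inside the window; leave high frequencies untouched.
    amp_fused = amp_s.copy()
    amp_fused[mask] = (1.0 - lam) * amp_s[mask] + lam * amp_t[mask]

    # Recombine the fused amplitude with the unchanged source phase and invert.
    f_fused = amp_fused * np.exp(1j * pha_s)
    x_fused = np.fft.ifft2(np.fft.ifftshift(f_fused))
    return np.real(x_fused)
```

Because phase carries most of the structural layout of an image, keeping the source phase leaves the annotated anatomy in place, while the blended low-frequency amplitude shifts global appearance (brightness, contrast, colour cast) toward the target domain.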

Abstract

High-quality annotated data play a crucial role in achieving accurate segmentation. However, such data for medical image segmentation are often scarce due to the time-consuming and labor-intensive nature of manual annotation. To address this challenge, we propose the Adaptive Frequency Domain Alignment Network (AFDAN), a novel domain adaptation framework designed to align features in the frequency domain and alleviate data scarcity. AFDAN integrates three core components to enable robust cross-domain knowledge transfer: an Adversarial Domain Learning Module that transfers features from the source to the target domain; a Source-Target Frequency Fusion Module that blends frequency representations across domains; and a Spatial-Frequency Integration Module that combines both frequency and spatial features to further enhance segmentation accuracy across domains. Extensive experiments demonstrate the effectiveness of AFDAN: it achieves an Intersection over Union (IoU) of 90.9% for vitiligo segmentation on the newly constructed VITILIGO2025 dataset and a competitive IoU of 82.6% on the retinal vessel segmentation benchmark DRIVE, surpassing existing state-of-the-art approaches.
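
For reference, the IoU metric quoted above is the ratio of the overlap between predicted and ground-truth masks to their union. The following is a minimal sketch for binary masks; the function name `iou` and the convention for two empty masks are assumptions made here, and the abstract does not state how the paper averages the score over images.

```python
import numpy as np


def iou(pred, target):
    """Intersection over Union for two binary masks of the same shape."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    if pred.shape != target.shape:
        raise ValueError("masks must have the same shape")

    union = np.logical_or(pred, target).sum()
    if union == 0:
        # Assumed convention: two empty masks count as a perfect match.
        return 1.0
    intersection = np.logical_and(pred, target).sum()
    return float(intersection / union)


pred = np.array([[1, 1, 0],
                 [0, 1, 0]])
target = np.array([[1, 0, 0],
                   [0, 1, 1]])
print(iou(pred, target))  # 2 overlapping pixels / 4 pixels in the union = 0.5
```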

Paper Structure

This paper contains 10 sections, 1 equation, 2 figures, 3 tables.

Figures (2)

  • Figure 1: Segmentation results for vitiligo (first row) and retinal vessels (second row). Predicted regions are highlighted in red (best viewed in color). Note that both results were obtained without direct training on datasets annotated specifically for these target structures: the vitiligo segmentation model was trained using skin lesion annotations, while the retinal vessel segmentation was derived from a model trained with general fundus image annotations.
  • Figure 2: The source and target domain images, $x_s$ and $x_t$, are first transformed into the frequency domain via the fast Fourier transform (FFT), and an Adaptive Fusion mechanism is applied to accelerate domain adaptation. The Adversarial Domain Learning module then reduces domain shift, where $F_t$ denotes the target-domain features. Finally, the spatial and frequency features are jointly fed into the Spatial-Frequency Integration module; the integrated features are concatenated, passed through the decoder, and multiplied with the attention map to produce the segmentation result.