FreqU-FNet: Frequency-Aware U-Net for Imbalanced Medical Image Segmentation
Ruiqi Xing
TL;DR
FreqU-FNet is a frequency-domain U-Net variant for imbalanced medical image segmentation. It combines a Frequency Domain Encoder that uses Daubechies wavelet downsampling and Fourier low-pass filtering, a Spatial Learnable Decoder with adaptive multi-branch upsampling, and a Frequency-Aware Loss that emphasizes high-frequency structures in minority classes. The architecture fuses spectral and spatial cues through native-space and space-channel sampling pathways, guided by learnable fusion weights and a spatial auxiliary learning module. Empirical results on the Medical Segmentation Decathlon (MSD) Prostate, Pancreas, and Lung datasets show that it outperforms CNN and Transformer baselines and, in particular, narrows the Dice gap between majority and minority classes. The study highlights the practical impact of reducing frequency aliasing and improving boundary precision for clinically important but underrepresented anatomical structures.
Abstract
Medical image segmentation faces persistent challenges due to severe class imbalance and the frequency-specific distribution of anatomical structures. Most conventional CNN-based methods operate in the spatial domain and struggle to capture minority-class signals; they are often affected by frequency aliasing and limited spectral selectivity. Transformer-based models, while powerful at modeling global dependencies, tend to overlook the local details needed for fine-grained segmentation. To overcome these limitations, we propose FreqU-FNet, a novel U-shaped segmentation architecture operating in the frequency domain. Our framework incorporates a Frequency Encoder that leverages Low-Pass Frequency Convolution and Daubechies wavelet-based downsampling to extract multi-scale spectral features. To reconstruct fine spatial details, we introduce a Spatial Learnable Decoder (SLD) equipped with an adaptive multi-branch upsampling strategy. Furthermore, we design a Frequency-Aware Loss (FAL) to enhance minority-class learning. Extensive experiments on multiple medical segmentation benchmarks demonstrate that FreqU-FNet consistently outperforms both CNN and Transformer baselines, particularly on under-represented classes, by effectively exploiting discriminative frequency bands.
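To make the idea of Daubechies wavelet-based downsampling concrete, the following is a minimal one-dimensional sketch, not the paper's implementation. It assumes the 4-tap Daubechies filter (often called D4 or db2), periodic boundary handling, and a single decomposition level; the abstract does not state which Daubechies order or boundary mode FreqU-FNet uses, and the function name `db2_downsample` is hypothetical. The sketch splits a signal into a half-length low-frequency (approximation) band and a half-length high-frequency (detail) band, so the length is halved without discarding the high-frequency content that plain strided pooling would alias.

```python
import math

SQRT3 = math.sqrt(3.0)
NORM = 4.0 * math.sqrt(2.0)

# Daubechies 4-tap (D4 / db2) low-pass (scaling) filter coefficients.
LOW = [(1 + SQRT3) / NORM, (3 + SQRT3) / NORM,
       (3 - SQRT3) / NORM, (1 - SQRT3) / NORM]
# Matching high-pass (wavelet) filter: g[k] = (-1)^k * h[3 - k].
HIGH = [LOW[3], -LOW[2], LOW[1], -LOW[0]]


def db2_downsample(signal):
    """One level of a periodic D4 wavelet transform (illustrative sketch).

    Returns (approx, detail), each half the length of the input.
    Assumption: periodic (wrap-around) boundary handling.
    """
    n = len(signal)
    if n < 4 or n % 2 != 0:
        raise ValueError("signal length must be even and at least 4")
    approx, detail = [], []
    for i in range(0, n, 2):  # stride 2 performs the downsampling
        window = [signal[(i + k) % n] for k in range(4)]
        approx.append(sum(h * x for h, x in zip(LOW, window)))
        detail.append(sum(g * x for g, x in zip(HIGH, window)))
    return approx, detail


if __name__ == "__main__":
    # A constant signal carries no high-frequency content,
    # so its detail band is zero up to rounding error.
    approx, detail = db2_downsample([2.0] * 8)
    print(approx)
    print(detail)
```

Because the filter pair is orthonormal, the two half-length bands together preserve the energy of the input, which is the property that lets a decoder recover fine detail later. For 2D images the same filters would be applied along rows and then columns, giving one approximation band and three detail bands per level.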
