
Spectral U-Net: Enhancing Medical Image Segmentation via Spectral Decomposition

Yaopeng Peng, Milan Sonka, Danny Z. Chen

TL;DR

Spectral U-Net tackles information loss during down-sampling in medical image segmentation by using the Dual Tree Complex Wavelet Transform (DTCWT) for down-sampling and its inverse (iDTCWT) for up-sampling. By decomposing feature maps into low- and high-frequency components across six orientations, the Wave-Block preserves detail while expanding channels, and the iWave-Block restores resolution via iDTCWT and fuses the result with skip connections. Evaluations on Retina Fluid, BRATS 2017, and LiTS 2017 within the nnU-Net framework show improved Dice scores and competitive Hausdorff distances. The results indicate that invertible, wavelet-based down-sampling and up-sampling can improve segmentation accuracy without substantial computational overhead, making it an alternative to traditional pooling in medical imaging tasks.
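For readers unfamiliar with the Dice score mentioned above, the following is a minimal sketch of the standard overlap formula, Dice = 2|P ∩ T| / (|P| + |T|), on flat binary masks. The function name `dice_score` and the choice to return 1.0 when both masks are empty are assumptions made for illustration; this is not the evaluation code used in the paper or in nnU-Net.

```python
def dice_score(pred, target):
    """Dice overlap between two flat binary masks (sequences of 0/1)."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same length")
    # Count pixels labelled foreground in both masks.
    intersection = sum(p * t for p, t in zip(pred, target))
    total = sum(pred) + sum(target)
    # Assumed convention: two empty masks count as a perfect match.
    if total == 0:
        return 1.0
    return 2.0 * intersection / total


pred = [1, 1, 0, 0]
target = [1, 0, 1, 0]
print(dice_score(pred, target))  # one shared pixel out of 2 + 2 -> 0.5
```

A score of 1.0 means the predicted and reference masks coincide exactly, and 0.0 means they share no foreground pixels.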

Abstract

This paper introduces Spectral U-Net, a novel deep learning network based on spectral decomposition that exploits the Dual Tree Complex Wavelet Transform (DTCWT) for down-sampling and the inverse Dual Tree Complex Wavelet Transform (iDTCWT) for up-sampling. We devise the corresponding Wave-Block and iWave-Block and integrate them into the U-Net architecture, aiming to mitigate information loss during down-sampling and to enhance detail reconstruction during up-sampling. In the encoder, we first decompose the feature map into high- and low-frequency components using DTCWT, enabling down-sampling while mitigating information loss. In the decoder, we utilize iDTCWT to reconstruct higher-resolution feature maps from down-sampled features. Evaluations on the Retina Fluid, Brain Tumor, and Liver Tumor segmentation datasets with the nnU-Net framework demonstrate the superiority of the proposed Spectral U-Net.
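The idea of halving resolution without discarding information can be illustrated with a much simpler transform than the one used in the paper. The sketch below swaps in a single-level 2D Haar wavelet transform, which yields one low-frequency band and three real detail bands, whereas the DTCWT yields six oriented complex subbands. It is a stand-in chosen so the example stays dependency-free, not the authors' Wave-Block or iWave-Block. The names `haar_down`, `haar_up`, and `avg_pool` are hypothetical.

```python
def haar_down(x):
    """One level of 2D Haar analysis on a 2D list with even height and width.

    Returns (ll, dx, dy, dxy): a low-frequency band plus three detail bands,
    each at half the input resolution.
    """
    ll, dx, dy, dxy = [], [], [], []
    for i in range(0, len(x), 2):
        rows = ([], [], [], [])
        for j in range(0, len(x[0]), 2):
            a, b = x[i][j], x[i][j + 1]
            c, d = x[i + 1][j], x[i + 1][j + 1]
            rows[0].append((a + b + c + d) / 2)  # scaled local sum (low frequency)
            rows[1].append((a - b + c - d) / 2)  # difference along the width
            rows[2].append((a + b - c - d) / 2)  # difference along the height
            rows[3].append((a - b - c + d) / 2)  # diagonal difference
        for band, row in zip((ll, dx, dy, dxy), rows):
            band.append(row)
    return ll, dx, dy, dxy


def haar_up(ll, dx, dy, dxy):
    """Inverse of haar_down: rebuild the full-resolution 2D list."""
    out = []
    for i in range(len(ll)):
        top, bottom = [], []
        for j in range(len(ll[0])):
            s, p, q, r = ll[i][j], dx[i][j], dy[i][j], dxy[i][j]
            top += [(s + p + q + r) / 2, (s - p + q - r) / 2]
            bottom += [(s + p - q - r) / 2, (s - p - q + r) / 2]
        out += [top, bottom]
    return out


def avg_pool(x):
    """2x2 average pooling, shown for contrast with the wavelet split."""
    return [
        [(x[i][j] + x[i][j + 1] + x[i + 1][j] + x[i + 1][j + 1]) / 4
         for j in range(0, len(x[0]), 2)]
        for i in range(0, len(x), 2)
    ]


edge = [[1, 3], [5, 7]]  # a patch containing a gradient
flat = [[4, 4], [4, 4]]  # a constant patch with the same mean

print(avg_pool(edge) == avg_pool(flat))    # pooling cannot tell them apart
print(haar_down(edge) != haar_down(flat))  # the detail bands differ
print(haar_up(*haar_down(edge)) == edge)   # the split is exactly invertible
```

Stacking the four half-resolution bands along the channel axis is one way such a split trades spatial resolution for channels. With the DTCWT the channel growth is larger, since each level produces six oriented complex subbands in addition to the low-frequency band.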

Paper Structure (16 sections, 6 equations, 3 figures, 5 tables)


Figures (3)

  • Figure 1: (a) The U-shape structure of our proposed Spectral U-Net, which applies DTCWT (b) in the encoding stage for spatial resolution reduction and iDTCWT (c) in the decoding stage for resolution reconstruction.
  • Figure 2: Visual examples from the Retina Fluid dataset (green for SRF, blue for PED). The dashed red boxes highlight intricate details of small objects and peripheral regions that our method captures but that nnU-Net, Swin UNETR, and DconnNet miss.