Layer-wise Noise Guided Selective Wavelet Reconstruction for Robust Medical Image Segmentation
Yuting Lu, Ziliang Wang, Weixin Xu, Wei Zhang, Yongqiang Zhao, Yang Yu, Xiaohong Zhang
TL;DR
This work tackles the brittleness of medical image segmentation under distribution shifts and adversarial perturbations. It introduces LNG-SWR, a dual-domain framework that learns a frequency-bias prior during training via layer-wise noise and enforces it at inference through prior-guided selective wavelet reconstruction. In the wavelet domain, the method preserves the LL band, suppresses HH, and gates (re-weights) LH/HL, then applies a lightweight iDWT-based reconstruction and a shallow fusion; a train-time Noise Injection Unit and a bottleneck dynamic multi-scale fusion module complement this. LNG-SWR is backbone-agnostic and works either as a plug-in to adversarial training (AT) or with standard training, with minimal overhead and improved boundary stability. Experiments on CT (LIDC-IDRI) and ultrasound (TN-SCUI) under a unified robustness protocol with PGD and SSAH attacks show consistent gains in clean Dice/IoU and smaller drops under strong attacks, with additive improvements when combined with AT. The results point to a practical, scalable path to robust medical image segmentation that does not sacrifice clean accuracy, in both adversarial and standard training regimes.
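The band handling described above (LL preserved, HH suppressed, LH/HL gated, followed by an iDWT and a shallow fusion) can be illustrated with a minimal NumPy sketch. It assumes a single-level orthonormal Haar wavelet, hand-set gate values, and a simple convex blend standing in for the shallow fusion. The function names `haar_dwt2`, `haar_idwt2`, `selective_reconstruct`, and `fuse`, and the parameters `gate_lh`, `gate_hl`, `hh_scale`, and `alpha`, are hypothetical; in LNG-SWR the gating is guided by the learned prior, which this sketch does not model.

```python
import numpy as np


def haar_dwt2(x):
    """Single-level orthonormal 2D Haar DWT; x must have even height and width."""
    x = np.asarray(x, dtype=float)
    a, b = x[0::2, 0::2], x[0::2, 1::2]  # top-left, top-right of each 2x2 block
    c, d = x[1::2, 0::2], x[1::2, 1::2]  # bottom-left, bottom-right
    ll = (a + b + c + d) / 2.0  # low-pass approximation
    lh = (a + b - c - d) / 2.0  # detail across rows (band naming conventions vary)
    hl = (a - b + c - d) / 2.0  # detail across columns
    hh = (a - b - c + d) / 2.0  # diagonal detail
    return ll, lh, hl, hh


def haar_idwt2(ll, lh, hl, hh):
    """Inverse of haar_dwt2: rebuilds the full-resolution array from four subbands."""
    h, w = ll.shape
    x = np.empty((2 * h, 2 * w), dtype=float)
    x[0::2, 0::2] = (ll + lh + hl + hh) / 2.0
    x[0::2, 1::2] = (ll + lh - hl - hh) / 2.0
    x[1::2, 0::2] = (ll - lh + hl - hh) / 2.0
    x[1::2, 1::2] = (ll - lh - hl + hh) / 2.0
    return x


def selective_reconstruct(x, gate_lh, gate_hl, hh_scale=0.0):
    """Keep LL unchanged, re-weight LH/HL with gates, scale HH (0.0 removes it)."""
    ll, lh, hl, hh = haar_dwt2(x)
    return haar_idwt2(ll, gate_lh * lh, gate_hl * hl, hh_scale * hh)


def fuse(x, recon, alpha=0.5):
    """Assumed stand-in for the shallow fusion: convex blend of input and reconstruction."""
    return (1.0 - alpha) * np.asarray(x, dtype=float) + alpha * recon


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    image = rng.normal(size=(8, 8))
    recon = selective_reconstruct(image, gate_lh=0.8, gate_hl=0.8, hh_scale=0.0)
    fused = fuse(image, recon, alpha=0.5)
    print(fused.shape)
```

Gates may be scalars or arrays that broadcast to the subband shape, so a per-pixel or per-channel gate can be substituted without changing the functions.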
Abstract
Clinical deployment requires segmentation models to stay stable under distribution shifts and perturbations. The mainstream remedy is adversarial training (AT); however, AT often brings a clean--robustness trade-off and high training/tuning cost, which limits scalability and maintainability in medical imaging. We propose \emph{Layer-wise Noise-Guided Selective Wavelet Reconstruction (LNG-SWR)}. During training, we inject small, zero-mean noise at multiple layers to learn a frequency-bias prior that steers representations away from noise-sensitive directions. We then apply prior-guided selective wavelet reconstruction on the input/feature branch to achieve frequency adaptation: suppress noise-sensitive bands, enhance directional structures and shape cues, and stabilize boundary responses while maintaining spectral consistency. The framework is backbone-agnostic and adds little inference overhead. It can serve as a plug-in enhancement to AT and also improves robustness without AT. On CT and ultrasound datasets, under a unified protocol with PGD-$L_{\infty}/L_{2}$ and SSAH, LNG-SWR delivers consistent gains on clean Dice/IoU and significantly reduces the performance drop under strong attacks. Combining LNG-SWR with AT yields additive gains, improving robustness further without sacrificing clean accuracy. These results indicate that LNG-SWR provides a simple, effective, and engineering-friendly path to robust medical image segmentation in both adversarial and standard training regimes.
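The train-time step of injecting small, zero-mean noise at multiple layers can be sketched on a toy fully connected network. This is an illustration only: the abstract does not specify the noise distribution, scale, or placement, so the Gaussian noise, the per-layer standard deviations `sigmas`, the injection point after each activation, and the names `inject_noise` and `forward` are all assumptions rather than the authors' Noise Injection Unit.

```python
import numpy as np


def inject_noise(h, sigma, rng, training):
    """Add zero-mean Gaussian noise to activations during training only (assumed form)."""
    if not training or sigma == 0.0:
        return h
    return h + rng.normal(0.0, sigma, size=h.shape)


def forward(x, weights, sigmas, rng, training=False):
    """Toy ReLU network with one noise-injection point after every layer."""
    h = x
    for w, sigma in zip(weights, sigmas):
        h = np.maximum(h @ w, 0.0)                 # linear layer + ReLU
        h = inject_noise(h, sigma, rng, training)  # layer-wise perturbation
    return h


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    weights = [rng.normal(size=(4, 8)), rng.normal(size=(8, 3))]
    x = rng.normal(size=(5, 4))
    train_out = forward(x, weights, sigmas=[0.05, 0.05], rng=rng, training=True)
    eval_out = forward(x, weights, sigmas=[0.05, 0.05], rng=rng, training=False)
    print(train_out.shape, eval_out.shape)
```

Because the noise is gated by the `training` flag, the inference path is deterministic and incurs no extra cost from this component.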
