Two-stage Deep Denoising with Self-guided Noise Attention for Multimodal Medical Images

S M A Sharif, Rizwan Ali Naqvi, Woong-Kee Loh

TL;DR

Medical image denoising faces challenges from diverse noise patterns across imaging modalities. The authors propose a two-stage deep denoising framework that first estimates residual noise with a Noise Estimation Network (NEN) and then refines the reconstruction with a Reconstruction Network (RN) that employs a self-guided Noise Attention Block (NOB), enabling coarse-to-fine denoising. Trained on synthesized multi-pattern noise across MRI, X-ray, CT, skin, and microscopy images, the method achieves state-of-the-art results for Gaussian and speckle denoising, demonstrated by notable gains in PSNR, SSIM, $\Delta E$, VIFP, and MSE, and is extended to real-world low-dose CT via transfer learning. The approach is lightweight (≈0.586M parameters) and efficient (≈25.7 ms per image), supporting practical deployment and cross-modal generalization. Overall, the work provides a robust, generalizable denoising framework that leverages cross-modal information and noise-residual guidance to handle heterogeneous medical imaging noise.
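
The two noise families named above can be illustrated with a minimal sketch. It assumes the textbook definitions (additive Gaussian noise, $y = x + n$, and multiplicative speckle noise, $y = x + x \cdot n$, with $n \sim \mathcal{N}(0, \sigma^2)$) and images stored as nested lists of floats in $[0, 1]$. The function names, the `sigma` values, and the clipping step are assumptions made for illustration; the authors' actual simulation settings are not given in this summary.

```python
import random


def clip(value, low=0.0, high=1.0):
    """Keep a pixel value inside the valid intensity range."""
    return max(low, min(high, value))


def add_gaussian_noise(image, sigma, rng):
    """Additive Gaussian noise: y = x + n, with n ~ N(0, sigma^2)."""
    return [[clip(p + rng.gauss(0.0, sigma)) for p in row] for row in image]


def add_speckle_noise(image, sigma, rng):
    """Multiplicative speckle noise: y = x + x * n, with n ~ N(0, sigma^2)."""
    return [[clip(p + p * rng.gauss(0.0, sigma)) for p in row] for row in image]


if __name__ == "__main__":
    rng = random.Random(0)                      # seeded for repeatability
    clean = [[0.5] * 4 for _ in range(4)]       # hypothetical flat grey patch
    noisy_gaussian = add_gaussian_noise(clean, sigma=0.1, rng=rng)
    noisy_speckle = add_speckle_noise(clean, sigma=0.1, rng=rng)
    print(noisy_gaussian[0])
    print(noisy_speckle[0])
```

Because speckle noise scales with the pixel intensity, dark regions are barely disturbed while bright regions are strongly corrupted. This is one reason a single model has to cope with quite different noise statistics across modalities.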

Abstract

Medical image denoising is considered among the most challenging vision tasks. Despite the real-world implications, existing denoising methods have notable drawbacks, as they often generate visual artifacts when applied to heterogeneous medical images. This study addresses this limitation of contemporary denoising methods with an artificial intelligence (AI)-driven two-stage learning strategy. The proposed method first learns to estimate the residual noise from the noisy images. It then incorporates a novel noise attention mechanism that correlates the estimated residual noise with the noisy inputs to perform denoising in a coarse-to-fine manner. This study also proposes a multi-modal learning strategy to generalize denoising across medical image modalities and multiple noise patterns for widespread applications. The practicability of the proposed method has been evaluated with dense experiments. The experimental results demonstrate that the proposed method achieves state-of-the-art performance, significantly outperforming existing medical image denoising methods in quantitative and qualitative comparisons. Overall, it shows a performance gain of 7.64 in Peak Signal-to-Noise Ratio (PSNR), 0.1021 in Structural Similarity Index (SSIM), 0.80 in DeltaE ($\Delta E$), 0.1855 in Visual Information Fidelity Pixel-wise (VIFP), and 18.54 in Mean Squared Error (MSE) metrics.
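
For readers unfamiliar with two of the reported metrics, the sketch below computes MSE and PSNR using their standard definitions, $\mathrm{MSE} = \frac{1}{N}\sum_i (x_i - \hat{x}_i)^2$ and $\mathrm{PSNR} = 10 \log_{10}(\mathrm{MAX}^2 / \mathrm{MSE})$. The nested-list image layout, the function names, and the default peak value of 255 are assumptions made for illustration and are not taken from the authors' evaluation code. Higher PSNR and lower MSE indicate a better reconstruction.

```python
import math


def mse(reference, estimate):
    """Mean squared error between two equally sized 2-D images."""
    total = 0.0
    count = 0
    for ref_row, est_row in zip(reference, estimate):
        for ref_px, est_px in zip(ref_row, est_row):
            total += (ref_px - est_px) ** 2
            count += 1
    return total / count


def psnr(reference, estimate, max_value=255.0):
    """Peak signal-to-noise ratio in decibels; infinite for identical images."""
    error = mse(reference, estimate)
    if error == 0:
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / error)


if __name__ == "__main__":
    ground_truth = [[100.0, 120.0], [140.0, 160.0]]   # hypothetical 2x2 patch
    denoised = [[102.0, 118.0], [141.0, 158.0]]
    print("MSE:", mse(ground_truth, denoised))
    print("PSNR (dB):", psnr(ground_truth, denoised))
```

Since PSNR is logarithmic in the MSE, a gain of a few decibels corresponds to a large relative reduction in squared error.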

Paper Structure

This paper contains 22 sections, 13 equations, 13 figures, 7 tables, 1 algorithm.

Figures (13)

  • Figure 1: Comparison between deep medical image denoising methods. The existing denoising methods are prone to produce smooth denoising results with visual artifacts. The top row depicts Gaussian denoising; the bottom row shows speckle denoising. In each row, left to right: GT image, noisy input, CAE [gondara2016medical], ResCNN [jifara2019medical], DnCNN [jiang2018denoising], MMD [el2022deep], DAE [el2022efficient], DRAN [sharif2020learning], and the proposed method.
  • Figure 2: Flowchart of the proposed method. The proposed method includes data collection and noise simulation strategies. The model is then trained on the collected data and extensively evaluated on simulated and real-world noisy medical images.
  • Figure 3: Overview of noisy image generation. The proposed simulation method generates noisy-clean image pairs for training and evaluation.
  • Figure 4: Overview of the proposed two-stage network. Stage I of the proposed method learns to estimate the residual noise, and Stage II leverages the estimated noise to refine the denoised outputs.
  • Figure 5: The architecture of the proposed noise attention block. It estimates the pixel-level attention over noisy input with a residual noise pattern.
  • ...and 8 more figures