WIA-LD2ND: Wavelet-based Image Alignment for Self-supervised Low-Dose CT Denoising

Haoyu Zhao, Yuliang Gu, Zhou Zhao, Bo Du, Yongchao Xu, Rui Yu

TL;DR

The paper tackles LDCT denoising under limited labeled data by introducing WIA-LD2ND, a self-supervised framework that leverages only NDCT data. It analyzes LDCT denoising from a frequency perspective, introduces Wavelet-based Image Alignment (WIA) to align NDCT and LDCT by perturbing high-frequency content, and proposes Frequency-Aware Multi-scale Loss (FAM) to enforce high-frequency fidelity in a multi-scale feature space using an online/target encoder with EMA. Experiments on Mayo-2016 and Mayo-2020 show that WIA-LD2ND achieves state-of-the-art performance among self-supervised/weakly-supervised methods, with notable gains in PSNR and SSIM and robust preservation of fine details; ablations confirm the contribution of each module, with WIA adding a modest parameter increase during training. The approach demonstrates practical promise for LDCT denoising in clinical settings by reducing data requirements and achieving high-quality reconstructions.
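The online/target encoder pair mentioned above is commonly maintained with an exponential moving average (EMA) of the online weights. The snippet below is a minimal sketch of that generic update rule, not the authors' implementation: the function name `ema_update`, the dict-of-arrays parameter layout, and the momentum value of 0.99 are all assumptions made for illustration.

```python
import numpy as np

def ema_update(target, online, momentum=0.99):
    """Generic EMA rule: new_target = momentum * target + (1 - momentum) * online.

    `target` and `online` are hypothetical dicts mapping parameter names to
    NumPy arrays of matching shapes. The momentum value is an assumption.
    """
    return {
        name: momentum * target[name] + (1.0 - momentum) * online[name]
        for name in target
    }

# Usage: the target encoder drifts slowly toward the online encoder.
target_params = {"w": np.zeros(3)}
online_params = {"w": np.ones(3)}
target_params = ema_update(target_params, online_params, momentum=0.9)
```

In schemes of this kind, only the online encoder receives gradients and the target encoder is updated through this rule alone, which keeps the target features stable between steps.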

Abstract

In clinical examinations and diagnoses, low-dose computed tomography (LDCT) is crucial for minimizing health risks compared with normal-dose computed tomography (NDCT). However, reducing the radiation dose compromises the signal-to-noise ratio, which degrades the quality of CT images. To address this, we analyze the LDCT denoising task from a frequency perspective, based on experimental results, and then introduce a novel self-supervised CT image denoising method called WIA-LD2ND that uses only NDCT data. The proposed WIA-LD2ND comprises two modules: Wavelet-based Image Alignment (WIA) and Frequency-Aware Multi-scale Loss (FAM). First, WIA aligns NDCT with LDCT by adding noise mainly to the high-frequency components, which are the main difference between LDCT and NDCT. Second, to better capture high-frequency components and detailed information, FAM is proposed, which makes effective use of a multi-scale feature space. Extensive experiments on two public LDCT denoising datasets demonstrate that WIA-LD2ND, using only NDCT data, outperforms several existing state-of-the-art weakly-supervised and self-supervised methods. Source code is available at https://github.com/zhaohaoyu376/WI-LD2ND.
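The idea behind WIA, perturbing only the high-frequency wavelet sub-bands of an NDCT image while leaving the low-frequency band untouched, can be sketched as follows. This is an illustrative sketch under stated assumptions, not the method from the paper: it uses a hand-written single-level Haar transform, additive Gaussian noise, and a hypothetical `sigma` parameter, and the sub-band names `lh`, `hl`, `hh` follow one of several naming conventions in use. The noise model and wavelet actually used by the authors may differ.

```python
import numpy as np

def haar_dwt2(x):
    """Single-level 2D Haar DWT. Expects even height and width."""
    a, b = x[0::2, 0::2], x[0::2, 1::2]
    c, d = x[1::2, 0::2], x[1::2, 1::2]
    ll = (a + b + c + d) / 2.0  # low-frequency approximation
    lh = (a - b + c - d) / 2.0  # differences across columns
    hl = (a + b - c - d) / 2.0  # differences across rows
    hh = (a - b - c + d) / 2.0  # diagonal differences
    return ll, (lh, hl, hh)

def haar_idwt2(ll, bands):
    """Inverse of haar_dwt2."""
    lh, hl, hh = bands
    h, w = ll.shape
    x = np.empty((2 * h, 2 * w), dtype=np.float64)
    x[0::2, 0::2] = (ll + lh + hl + hh) / 2.0
    x[0::2, 1::2] = (ll - lh + hl - hh) / 2.0
    x[1::2, 0::2] = (ll + lh - hl - hh) / 2.0
    x[1::2, 1::2] = (ll - lh - hl + hh) / 2.0
    return x

def perturb_high_freq(x, sigma=0.05, rng=None):
    """Add Gaussian noise (an assumption) to the high-frequency bands only."""
    rng = np.random.default_rng() if rng is None else rng
    ll, bands = haar_dwt2(np.asarray(x, dtype=np.float64))
    noisy = tuple(band + rng.normal(0.0, sigma, size=band.shape) for band in bands)
    return haar_idwt2(ll, noisy)

# Usage: a random array stands in for a normalized NDCT slice.
rng = np.random.default_rng(0)
ndct = rng.random((8, 8))
pseudo_ldct = perturb_high_freq(ndct, sigma=0.1, rng=rng)
```

Because the Haar transform is orthogonal, the low-frequency band of `pseudo_ldct` is identical to that of `ndct`; only the three detail bands change, which mirrors the observation that NDCT and LDCT differ mainly in their high-frequency content.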

Paper Structure (9 sections, 5 equations, 4 figures, 2 tables)

Figures (4)

  • Figure 1: (a) Visualization of NDCT and LDCT after the Discrete Wavelet Transform (DWT). The primary differences between NDCT and LDCT lie in the high-frequency components $[LH, HL, HH]$. (b-c) Normalized features of the low-frequency (LF) component $LL$ of NDCT and LDCT. (d-e) Normalized features of the high-frequency (HF) components $[LH, HL, HH]$. Image features are extracted with the first residual block of a pre-trained ResNet-18 (He et al., 2016).
  • Figure 2: Overview of the proposed WIA-LD2ND. The NDCT image $x$ is passed through WIA and then fed into the reconstruction network. The high-frequency components of the denoised CT $y$ and of the input image $x$ are both fed into FAM, which computes a loss on high-frequency components in a multi-scale feature space.
  • Figure 3: (a-c) Visualization of NDCT, the residual between NDCT and the result of BM3D (Dabov et al., 2007; a classical denoising method), and the high-frequency components in the spatial domain. The residual is converted into a binary image for clarity. The high-frequency band is filtered from the image and the result is likewise converted into a binary image. (d-e) t-SNE visualization of the feature distributions of NDCT, LDCT, and their respective transformations after applying WIA, on the Mayo-2016 dataset. Image features are extracted with the first residual block of a pre-trained ResNet-18.
  • Figure 4: Qualitative comparison of different methods on the Mayo-2020 dataset (Moen et al., 2021).