Patch Triplet Similarity Purification for Guided Real-World Low-Dose CT Image Denoising

Junhao Long, Fengwei Yang, Juncheng Yan, Baoping Zhang, Chao Jin, Jian Yang, Changliang Zou, Jun Xu

TL;DR

This work tackles real-world low-dose CT (LDCT) denoising by using clean non-contrast CT (NCCT) images as guidance and by addressing spatial misalignment with a Patch Triplet Similarity Purification (PTSP) strategy. The authors replace vanilla self-attention with cross-attention in SwinIR and HAT and train on highly similar LDCT/NDCT/NCCT patch triplets; the resulting models outperform 15 baselines on both synthetic and real-world LDCT datasets. Ablation studies confirm the value of NCCT guidance and PTSP data selection, with the best results at a patch size of 64, a PTSP threshold of $s=0.85$, and $n=3$ segmentation points. The method preserves anatomical structures while reducing noise, which suggests clinical value for safer LDCT imaging; the authors plan to release their data and code publicly.

Abstract

Image denoising of low-dose computed tomography (LDCT) is an important problem for clinical diagnosis with reduced radiation exposure. Previous methods are mostly trained with pairs of synthetic or misaligned LDCT and normal-dose CT (NDCT) images. However, when trained with synthetic noise or misaligned LDCT/NDCT image pairs, denoising networks suffer from blurry structures or motion artifacts. Since non-contrast CT (NCCT) images share content characteristics with the corresponding NDCT images in a three-phase scan, they can potentially provide useful information for real-world LDCT image denoising. To exploit this, we propose to incorporate clean NCCT images as guidance for learning real-world LDCT image denoising networks. To alleviate spatial misalignment in the training data, we design a new Patch Triplet Similarity Purification (PTSP) strategy that selects highly similar patch (instead of image) triplets of LDCT, NDCT, and NCCT images for network training. Furthermore, we modify two image denoising transformers, SwinIR and HAT, to accommodate the NCCT image guidance by replacing vanilla self-attention with cross-attention. On our collected clinical dataset, the modified transformers trained with the data selected by our PTSP strategy outperform 15 comparison methods on real-world LDCT image denoising. Ablation studies validate the effectiveness of our NCCT image guidance and PTSP strategy. We will publicly release our data and code.
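
To make the "cross-attention instead of self-attention" change concrete, here is a minimal single-head sketch in NumPy. It is not the authors' implementation: the assumption that queries come from LDCT features while keys and values come from NCCT features is ours, as are the names `cross_attention`, `x_ldct`, `x_ncct`, and the random projection matrices. Window partitioning, multiple heads, and the rest of SwinIR/HAT are omitted.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(x_ldct, x_ncct, w_q, w_k, w_v):
    """Single-head cross-attention over token matrices of shape (tokens, dim).

    Assumption (not confirmed by the source): queries are projected from the
    LDCT tokens, keys and values from the NCCT guidance tokens.
    """
    q = x_ldct @ w_q
    k = x_ncct @ w_k
    v = x_ncct @ w_v
    attn = softmax(q @ k.T / np.sqrt(q.shape[-1]))  # (tokens, tokens)
    return attn @ v

# Hypothetical toy inputs: 16 tokens with 8 channels each.
rng = np.random.default_rng(0)
tokens, dim = 16, 8
x_ldct = rng.standard_normal((tokens, dim))
x_ncct = rng.standard_normal((tokens, dim))
w_q, w_k, w_v = (rng.standard_normal((dim, dim)) for _ in range(3))

out = cross_attention(x_ldct, x_ncct, w_q, w_k, w_v)
```

Passing the same tensor for both inputs, `cross_attention(x, x, ...)`, recovers ordinary self-attention, so under these assumptions the change amounts to swapping the source of the keys and values.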

Paper Structure

This paper contains 29 sections, 6 equations, 7 figures, 8 tables.

Figures (7)

  • Figure 1: Motivation of our NCCT image guidance and Patch Triplet Similarity Purification (PTSP) strategy. (a) The NCCT image is structurally similar to the corresponding NDCT image from three-phase scanning. However, when the LDCT image is overlaid on the corresponding NDCT and NCCT images, there is clear spatial misalignment. (b) Using patch-level rather than image-level guidance for network training.
  • Figure 2: Training data screening based on RMSE vs. our Patch Triplet Similarity Purification (PTSP) strategy. Mask Similarity is abbreviated as "Similarity". Subfigure (a) shows more significant structural differences among the LDCT, NDCT, and NCCT image patches than (b) and (c). However, the RMSE metric gives the opposite conclusion. There are obvious "brightness" differences among the three image patches in (c). The average pixel value of the LDCT patch is 169.23, which is 25.26 higher than the average pixel value of the NCCT patch and 10.64 higher than that of the NDCT patch. (d) An LDCT image. (e) The corresponding NCCT image. (f) The denoised image of SwinIR trained with the data screened by the RMSE metric. (g) The denoised image of SwinIR with the PSP strategy (psp2024). (h) The denoised image of SwinIR with NCCT image guidance and our PTSP strategy. (i) The corresponding NDCT image. In general, the proposed NCCT image guidance and PTSP strategy for training data selection recover the structure of the denoised image well on real-world LDCT image denoising.
  • Figure 3: The proposed Patch Triplet Similarity Purification (PTSP) strategy. It includes three main steps: 1) compute the discretized image patches $M_G$, $M_L$, and $M_N$ according to the set pixel interval; 2) obtain the difference maps $D_{LN}$ (or $D_{LG}$) by subtracting the discretized LDCT image patch from the discretized NDCT (or NCCT) image patch; 3) compute the corresponding mask similarity based on the difference maps. When the mask similarity reaches a preset threshold $s$ (e.g., $s=0.85$), the patch triplet is included in the training set of our clinical dataset.
  • Figure 4: Architectures of Self-Attention (SA) in vanilla SwinIR/HAT and Cross-Attention (CA) in modified SwinIR/HAT to incorporate the guidance of NCCT images.
  • Figure 5: Synthetic LDCT images obtained by adding Poisson noise to the sinogram data of the corresponding NDCT image vs. our synthetic LDCT image. (a) Real-world NDCT image. (b) Synthetic LDCT image obtained by adding Poisson noise to the sinogram data of the corresponding NDCT image. (c) Our synthetic LDCT image. (d) Real-world LDCT image.
  • ...and 2 more figures
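
The three PTSP steps in the Figure 3 caption can be sketched roughly as follows. Several details are assumptions rather than facts from the source: the $n$ segmentation points are taken to be evenly spaced over an 8-bit range, mask similarity is taken to be the fraction of pixels whose difference-map value is zero, and a triplet is kept only when both the LDCT/NDCT and LDCT/NCCT similarities reach $s$. The function names are hypothetical.

```python
import numpy as np

def discretize(patch, n_points=3, lo=0.0, hi=255.0):
    # Step 1: map each pixel to an interval index. The n_points evenly spaced
    # interior cut points over [lo, hi] are an assumption of this sketch.
    cuts = np.linspace(lo, hi, n_points + 2)[1:-1]
    return np.digitize(patch, cuts)

def mask_similarity(ldct, other, n_points=3):
    # Step 2: difference map, i.e. discretized `other` minus discretized LDCT.
    diff = discretize(other, n_points) - discretize(ldct, n_points)
    # Step 3: assumed definition, the fraction of pixels with zero difference.
    return float(np.mean(diff == 0))

def keep_triplet(ldct, ndct, ncct, s=0.85, n_points=3):
    # Assumed rule: both similarities must reach the threshold s.
    sim_ln = mask_similarity(ldct, ndct, n_points)
    sim_lg = mask_similarity(ldct, ncct, n_points)
    return sim_ln >= s and sim_lg >= s

# Toy 64x64 patches (hypothetical values, not data from the paper).
ldct = np.full((64, 64), 100.0)
ndct = ldct + 10.0            # uniform brightness offset, same structure
ncct_shifted = ldct.copy()
ncct_shifted[:32] = 200.0     # top half shows different structure

rmse_brightness = float(np.sqrt(np.mean((ndct - ldct) ** 2)))
sim_brightness = mask_similarity(ldct, ndct)
sim_structure = mask_similarity(ldct, ncct_shifted)
```

In this toy case the uniform brightness offset yields a nonzero RMSE but leaves the discretized masks identical, while the structural change halves the similarity. This mirrors the contrast between RMSE screening and mask similarity drawn in the Figure 2 caption, within the limits of the assumed definitions.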