Multi-Scale Texture Loss for CT denoising with GANs

Francesco Di Feola, Lorenzo Tronchin, Valerio Guarrasi, Paolo Soda

TL;DR

The paper tackles texture fidelity in GAN-based LDCT denoising by introducing Multi-Scale Texture Loss (MSTLF), a differentiable, self-attention–aggregated texture representation integrated into the training objective. By evaluating MSTLF across three prominent GAN backbones (Pix2Pix, CycleGAN, UNIT) and three public datasets, the authors demonstrate improvements in both pixel-wise and perceptual quality metrics, with MSTLF-average excelling on paired measures (PSNR, MSE, SSIM) and MSTLF-attention boosting perceptual quality (NIQE, PIQUE). The approach is data- and architecture-agnostic within the studied CT denoising context and is supported by per-patient statistical analyses that underscore robust gains across datasets. The work provides a practical, architecture-robust method to enhance texture realism in LDCT denoising, with code available for reproducibility.

Abstract

Generative Adversarial Networks (GANs) have proved to be a powerful framework for denoising applications in medical imaging. However, GAN-based denoising algorithms still suffer from limitations in capturing complex relationships within the images. In this regard, the loss function plays a crucial role in guiding the image generation process, since it quantifies how much a synthetic image differs from a real one. To grasp highly complex and non-linear textural relationships in the training process, this work presents a novel approach to capture and embed multi-scale texture information into the loss function. Our method introduces a differentiable multi-scale texture representation of the images, dynamically aggregated by a self-attention layer, thus exploiting end-to-end gradient-based optimization. We validate our approach through extensive experiments on low-dose CT denoising, a challenging application that aims to enhance the quality of noisy CT scans. We use three publicly available datasets: one simulated and two real. The results are promising compared with other well-established loss functions and are consistent across three different GAN architectures. The code is available at: https://github.com/TrainLaboratory/MultiScaleTextureLoss-MSTLF

Paper Structure

This paper contains 8 sections, 14 equations, 8 figures, and 12 tables.

Figures (8)

  • Figure 1: Visual comparison of denoised CT slices from the LIDC/IDRI dataset (Dataset B). We show all loss function configurations for CycleGAN. The images are organized into "Reference Images" (panel (a)), "Competitors" (panel (b)) and "Our Approach" (panel (c)). For each configuration, results are displayed in the image domain (left image, with a display window of [-1200, 200] HU) and the gradient domain (right image). Zoomed ROIs highlight key regions demonstrating noise reduction effectiveness.
  • Figure 2: Visual comparison of denoised CT slices from the ELCAP dataset (Dataset C). We show all loss function configurations for CycleGAN. The images are organized into "Reference Images" (panel (a)), "Competitors" (panel (b)) and "Our Approach" (panel (c)). For each configuration, results are displayed in the image domain (left image, with a display window of [-1200, 200] HU) and the gradient domain (right image). Zoomed ROIs highlight key regions demonstrating noise reduction effectiveness.
  • Figure 3: Visual comparison of denoised CT slices from the Mayo Clinic simulated data (Dataset A). We show all loss function configurations for Pix2Pix. The images are organized into "Reference Images" (panel (a)), "Competitors" (panel (b)) and "Our Approach" (panel (c)). For each configuration, results are displayed in the image domain (left image, with a display window of [-1200, 200] HU) and the gradient domain (right image). Zoomed ROIs highlight key regions demonstrating noise reduction effectiveness.
  • Figure 4: Visual comparison of denoised CT slices from the LIDC/IDRI dataset (Dataset B). We show all loss function configurations for Pix2Pix. The images are organized into "Reference Images" (panel (a)), "Competitors" (panel (b)) and "Our Approach" (panel (c)). For each configuration, results are displayed in the image domain (left image, with a display window of [-1200, 200] HU) and the gradient domain (right image). Zoomed ROIs highlight key regions demonstrating noise reduction effectiveness.
  • Figure 5: Visual comparison of denoised CT slices from the ELCAP dataset (Dataset C). We show all loss function configurations for Pix2Pix. The images are organized into "Reference Images" (panel (a)), "Competitors" (panel (b)) and "Our Approach" (panel (c)). For each configuration, results are displayed in the image domain (left image, with a display window of [-1200, 200] HU) and the gradient domain (right image). Zoomed ROIs highlight key regions demonstrating noise reduction effectiveness.
  • ...and 3 more figures
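
The captions above describe two views of each slice: the image domain under a [-1200, 200] HU display window, and the gradient domain. The following is a minimal sketch of how such views could be produced, not the authors' plotting code. The function names `apply_window` and `gradient_magnitude` are hypothetical, and the forward-difference gradient is an assumption, since the captions do not say which gradient operator was used.

```python
import math


def apply_window(hu_rows, lo=-1200.0, hi=200.0):
    """Clip HU values to [lo, hi] and rescale linearly to [0, 1] for display."""
    span = hi - lo
    return [[(min(max(v, lo), hi) - lo) / span for v in row] for row in hu_rows]


def gradient_magnitude(rows):
    """Gradient magnitude from forward differences (assumed operator).

    Differences that would reach past the last row or column are set to zero.
    """
    h, w = len(rows), len(rows[0])
    out = []
    for i in range(h):
        out_row = []
        for j in range(w):
            dx = rows[i][j + 1] - rows[i][j] if j + 1 < w else 0.0
            dy = rows[i + 1][j] - rows[i][j] if i + 1 < h else 0.0
            out_row.append(math.hypot(dx, dy))
        out.append(out_row)
    return out


# Toy 2x2 "slice" in Hounsfield units, including values outside the window.
slice_hu = [[-2000.0, -1200.0], [-500.0, 400.0]]
windowed = apply_window(slice_hu)       # image-domain view in [0, 1]
edges = gradient_magnitude(windowed)    # gradient-domain view
```

Values below -1200 HU map to 0 and values above 200 HU map to 1, so the window spends the full display range on lung and soft-tissue intensities. The gradient view then emphasizes edges and residual noise texture.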