Deep Few-view High-resolution Photon-counting CT at Halved Dose for Extremity Imaging

Mengzhou Li, Chuang Niu, Ge Wang, Maya R Amma, Krishna M Chapagain, Stefan Gabrielson, Andrew Li, Kevin Jonker, Niels de Ruiter, Jennifer A Clark, Phil Butler, Anthony Butler, Hengyong Yu

TL;DR

This work tackles radiation-dose reduction in high-resolution extremity PCCT by introducing a deep learning–driven, patch-based volumetric reconstruction pipeline that combines a low-noise structural prior, deep iterative refinement (DIR) via an ADMM framework, and texture appearance tuning. The approach leverages a Volumetric Sparse Representation Network (VSR-Net) trained on synthetic data and refined with a model-based proximal operator, followed by Residual Fourier Channel Attention Network (RFCAN) post-processing to align texture with clinical references. By partitioning large volumes, processing them patch by patch, and sharing a low-noise prior across channels, the method addresses memory and domain-gap challenges, achieving halved-dose and doubled-speed reconstruction validated in a New Zealand clinical trial. Phantom tests and an 8-patient reader study indicate comparable or superior diagnostic image quality and spectral fidelity relative to full-view reconstructions, supporting potential clinical translation for dose reduction in HR PCCT.
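The alternation between a learned prior and a model-based data-fidelity step can be illustrated with a generic plug-and-play ADMM loop. The following is a minimal sketch under stated assumptions, not the authors' implementation: `A` is a small dense stand-in for the projection operator, `denoise` is a placeholder for a network such as VSR-Net, and the names `pnp_admm`, `rho`, and `n_iter` are hypothetical.

```python
import numpy as np

def pnp_admm(A, b, denoise, rho=1.0, n_iter=50):
    """Generic scaled-form ADMM for min_x 0.5*||A x - b||^2 + prior(x).

    A       : (m, n) dense matrix standing in for the projection model (assumption)
    b       : (m,) measurement vector
    denoise : callable replacing the proximal operator of the prior (assumption)
    """
    n = A.shape[1]
    x = np.zeros(n)
    z = np.zeros(n)
    u = np.zeros(n)
    # The x-update is a regularized least-squares problem with a fixed matrix.
    lhs = A.T @ A + rho * np.eye(n)
    Atb = A.T @ b
    for _ in range(n_iter):
        x = np.linalg.solve(lhs, Atb + rho * (z - u))  # data-fidelity step
        z = denoise(x + u)                             # prior (denoising) step
        u = u + x - z                                  # scaled dual update
    return z

# Toy usage: with an identity "denoiser" the loop reduces to iterated
# regularized least squares and approaches the least-squares solution.
A = np.vstack([np.eye(5), 2.0 * np.eye(5)])
x_true = np.arange(5, dtype=float)
x_hat = pnp_admm(A, A @ x_true, denoise=lambda v: v)
```

In a real few-view reconstruction the dense solve would be replaced by an iterative solver using forward and back projection, and the denoiser would be applied patch by patch.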

Abstract

X-ray photon-counting computed tomography (PCCT) for extremities allows multi-energy high-resolution (HR) imaging, but its radiation dose can be further reduced. Despite the great potential of deep learning techniques, their application in HR volumetric PCCT reconstruction has been challenged by the large memory burden, training data scarcity, and domain gap issues. In this paper, we propose a deep learning-based approach for PCCT image reconstruction at halved dose and doubled speed, validated in a New Zealand clinical trial. Specifically, we design a patch-based volumetric refinement network to alleviate the GPU memory limitation, train the network with synthetic data, and use model-based iterative refinement to bridge the gap between synthetic and clinical data. Our results in a reader study of 8 patients from the clinical trial demonstrate great potential to cut the radiation dose to half that of the clinical PCCT standard without compromising image quality and diagnostic value.
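To make the memory argument concrete, the sketch below shows one common way to run a per-patch function over a large volume and blend the overlapping outputs by averaging. It is an illustration only: the patch size, stride, and averaging rule are assumptions not stated in this section, and `fn` is a placeholder for a refinement network.

```python
import itertools
import numpy as np

def axis_starts(length, patch, stride):
    """Patch start indices along one axis; the last patch is shifted to end at the border."""
    if not 0 < stride <= patch:
        raise ValueError("stride must be in (0, patch] so that patches cover the axis")
    if length <= patch:
        return [0]
    starts = list(range(0, length - patch, stride))
    starts.append(length - patch)
    return starts

def process_in_patches(volume, fn, patch=32, stride=24):
    """Apply fn to overlapping cubic patches of a 3D volume and average the overlaps.

    fn must return an array with the same shape as the patch it receives.
    """
    out = np.zeros(volume.shape, dtype=np.float64)
    weight = np.zeros(volume.shape, dtype=np.float64)
    starts = [axis_starts(n, patch, stride) for n in volume.shape]
    for z, y, x in itertools.product(*starts):
        sl = (slice(z, z + patch), slice(y, y + patch), slice(x, x + patch))
        out[sl] += fn(volume[sl])   # only one patch is processed at a time
        weight[sl] += 1.0           # count how many patches touched each voxel
    return out / weight

# Toy usage: an identity "network" must reproduce the input volume.
vol = np.random.default_rng(0).normal(size=(20, 17, 9))
same = process_in_patches(vol, lambda p: p, patch=8, stride=6)
```

Only one patch is passed to `fn` at a time, so peak memory for the per-patch computation depends on the patch size rather than on the full volume.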

Paper Structure (23 sections, 9 equations, 13 figures, 6 tables)

Figures (13)

  • Figure 1: Deep few-view PCCT workflow. (a) A less noisy structural prior is reconstructed by summing counts from all channels and using a multi-scale iterative reconstruction (MS-IR) technique; (b) for image reconstruction in each channel, the structural prior is iteratively refined using a Volumetric Sparse Representation Network (VSR-Net) and model-based guidance with the projection measurements in an Alternating Direction Method of Multipliers (ADMM) framework; and (c) the multi-channel images are further refined using a Residual Fourier Channel Attention Network (RFCAN) for alignment with the MARS full-dose reconstruction, followed by further polishing with the Simultaneous Iterative Reconstruction Technique (SIRT) to produce the image sharpness and noise characteristics that radiologists prefer.
  • Figure 2: Architecture of our volumetric sparse representation network (VSR-Net). (a) This light-weight network takes a small cubic patch as input and outputs a denoised patch, with 3D pixel shuffle operations and grouped convolutions; and (b) the downscaling and upscaling of the feature maps are achieved through 3D pixel unshuffle and shuffle operations (illustrated in (c)) combined with two 3D grouped convolutional layers. Note that a color-coded number above each convolutional operation denotes the number of groups used, while the number underneath the feature map indicates the number of channels.
  • Figure 3: Motivation and architecture of the residual Fourier channel attention network (RFCAN). It is mainly intended to (i) correct contrast shift and (ii) adjust noise texture to match clinical references. (a) The difference images of the deep iterative refinement (DIR) results and the RFCAN results, against the MARS full-view reconstruction reference, reveal clear misregistration in the upper part of the volume, highlighting the need for a misalignment-insensitive loss function. A slight value shift for bones is also observed in the DIR result, indicated by the dark region marked by arrows. (b) Zoomed-in views of a flat tissue region show a notable texture discrepancy between the DIR result and the reference, despite their similar noise levels (standard deviation) and mean values. (c) Noise power spectrum (NPS) curves, estimated from the flat tissue region, further confirm this texture difference, motivating the alignment of bone values and the adjustment of noise characteristics to reduce perception bias using RFCAN. (d) The proposed RFCAN consists of 15 Fourier channel attention residual blocks (FCA-ResBlocks) built upon attention layers with FCA (FCA-Layer). It functions as a post-processing step applied to the multi-channel DIR outputs.
  • Figure 4: Interleaved updating for large volume reconstruction. (a) Partitioning the projections and image volume to form a batch of tasks for sub-volume reconstruction, and (b) combining the results in an interleaved pattern with slices at one or both ends trimmed off to ensure data completeness.
  • Figure 5: Representative images reconstructed using the competing methods on simulated data. (a) The full-view reconstructions with FDK, SIRT-TV, and our method displayed against the ground truth, including exemplary axial, coronal, and sagittal views from top to bottom; (b) the reconstructions from halved views; (c) error maps of the half-view reconstructions against the ground truth for SIRT-TV (left half) and the proposed method (right half); and (d) magnified regions from the coronal and sagittal views, as indicated by the green and orange boxes respectively, displayed in descending order of image sharpness and structural fidelity: ground truth, our full-view and half-view reconstructions, and full-view and half-view reconstructions with SIRT-TV from top to bottom. The display window settings are W/L: 400/50 HU for images and W/L: 200/0 HU for error maps. The red arrows highlight structural details that are recovered by our method but are challenging for SIRT-TV, which shows, e.g., a loss of resolution as indicated in (c) and a blotchy, cartoonish appearance as shown in (d).
  • ...and 8 more figures
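
The 3D pixel unshuffle and shuffle operations named in the Figure 2 caption can be sketched in NumPy as follows. This is a minimal illustration assuming a channel-first `(C, D, H, W)` layout and an integer factor `r`; the function names and the channel ordering are assumptions, since the caption does not specify the authors' exact convention.

```python
import numpy as np

def pixel_unshuffle_3d(x, r):
    """Fold each r*r*r spatial block into channels: (C, D, H, W) -> (C*r^3, D/r, H/r, W/r)."""
    c, d, h, w = x.shape
    if d % r or h % r or w % r:
        raise ValueError("spatial dimensions must be divisible by r")
    x = x.reshape(c, d // r, r, h // r, r, w // r, r)
    # Move the three intra-block offsets next to the channel axis.
    x = x.transpose(0, 2, 4, 6, 1, 3, 5)
    return x.reshape(c * r**3, d // r, h // r, w // r)

def pixel_shuffle_3d(x, r):
    """Inverse of pixel_unshuffle_3d: (C*r^3, D, H, W) -> (C, D*r, H*r, W*r)."""
    cr3, d, h, w = x.shape
    if cr3 % r**3:
        raise ValueError("channel count must be divisible by r**3")
    c = cr3 // r**3
    x = x.reshape(c, r, r, r, d, h, w)
    # Interleave each offset axis back behind its spatial axis.
    x = x.transpose(0, 4, 1, 5, 2, 6, 3)
    return x.reshape(c, d * r, h * r, w * r)

# Toy usage: downscale a 2-channel patch by 2 and restore it exactly.
patch = np.arange(2 * 4 * 6 * 8, dtype=np.float32).reshape(2, 4, 6, 8)
down = pixel_unshuffle_3d(patch, 2)   # shape (16, 2, 3, 4)
restored = pixel_shuffle_3d(down, 2)  # shape (2, 4, 6, 8)
```

Because both functions only rearrange voxels, the spatial size changes without discarding information; any learned mixing would come from the grouped convolutions that the caption pairs with them.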