Reconstruct Spine CT from Biplanar X-Rays via Diffusion Learning

Zhi Qiao, Xuhui Liu, Xiaopeng Wang, Runkun Liu, Xiantong Zhen, Pei Dong, Zhen Qian

TL;DR

The paper addresses reconstruction of 3D spine CT from intraoperative biplanar X-rays for cases where CT is unavailable. It introduces Diff2CT, a conditional diffusion model that takes two orthogonal X-ray views as input; X2NoiseNet converts the 2D inputs into 3D conditioning features, and a projection-based loss encourages 3D structural fidelity. On a lumbar pedicle screw dataset, Diff2CT achieves higher structural similarity (SSIM 0.83) and a lower FID (83.43) than state-of-the-art baselines, with some trade-offs in voxel-wise metrics. The work shows the potential of diffusion-based CT reconstruction for intraoperative guidance and points to further improvements with real paired data and refined conditioning strategies.
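
As a rough illustration of the diffusion side of such a model, the sketch below applies the standard DDPM forward (noising) step to a small 3D volume. It is not taken from the paper: the linear beta schedule, the number of steps `T`, and the names `linear_beta_schedule` and `q_sample` are assumptions chosen for illustration, and the X-ray conditioning performed by X2NoiseNet is not modeled here.

```python
import numpy as np

def linear_beta_schedule(T, beta_start=1e-4, beta_end=0.02):
    """Linearly spaced noise variances (an assumed, commonly used schedule)."""
    return np.linspace(beta_start, beta_end, T)

def q_sample(x0, t, alphas_cumprod, rng):
    """Draw x_t ~ q(x_t | x_0) = N(sqrt(a_bar_t) * x_0, (1 - a_bar_t) * I)."""
    noise = rng.standard_normal(x0.shape)
    a_bar = alphas_cumprod[t]
    x_t = np.sqrt(a_bar) * x0 + np.sqrt(1.0 - a_bar) * noise
    return x_t, noise

T = 1000                                   # assumed number of diffusion steps
betas = linear_beta_schedule(T)
alphas_cumprod = np.cumprod(1.0 - betas)   # cumulative signal retention a_bar_t

rng = np.random.default_rng(0)
x0 = rng.random((8, 8, 8))                 # toy stand-in for a CT volume
x_t, noise = q_sample(x0, T - 1, alphas_cumprod, rng)
```

A denoising network would be trained to predict `noise` from `x_t`, the step index, and, in a conditional setting, features derived from the two X-ray views.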

Abstract

Intraoperative CT imaging serves as a crucial resource for surgical guidance; however, it may not always be readily accessible or practical to implement. In scenarios where CT imaging is not an option, reconstructing CT scans from X-rays can offer a viable alternative. In this paper, we introduce an innovative method for 3D CT reconstruction utilizing biplanar X-rays. Distinct from previous research that relies on conventional image generation techniques, our approach leverages a conditional diffusion process to tackle the task of reconstruction. More precisely, we employ a diffusion-based probabilistic model trained to produce 3D CT images based on orthogonal biplanar X-rays. To improve the structural integrity of the reconstructed images, we incorporate a novel projection loss function. Experimental results validate that our proposed method surpasses existing state-of-the-art benchmarks in both visual image quality and multiple evaluative metrics. Specifically, our technique achieves a higher Structural Similarity Index (SSIM) of 0.83, a relative increase of 10%, and a lower Fréchet Inception Distance (FID) of 83.43, which represents a relative decrease of 25%.
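
The abstract does not spell out the projection loss, so the following is only a minimal sketch of the general idea: compare 2D projections of the reconstructed volume with projections of the reference volume along two orthogonal directions. The parallel-beam approximation (a mean along one axis), the L1 distance, the choice of axes, and the function names `project` and `projection_loss` are all assumptions, not the authors' formulation.

```python
import numpy as np

def project(volume, axis):
    """Approximate a parallel-beam projection as the mean intensity along one axis."""
    return volume.mean(axis=axis)

def projection_loss(pred, target, axes=(1, 2)):
    """Mean absolute difference between projections, averaged over the given axes."""
    losses = [np.abs(project(pred, a) - project(target, a)).mean() for a in axes]
    return float(np.mean(losses))

rng = np.random.default_rng(0)
target = rng.random((16, 16, 16))   # toy stand-in for a ground-truth CT volume
pred = target + 0.1                 # prediction with a uniform intensity offset

loss_same = projection_loss(target, target)   # identical volumes give zero loss
loss_shifted = projection_loss(pred, target)  # offset shows up in both projections
```

In a training loop, a term like this would typically be added to the usual diffusion denoising objective with a weighting factor, so that the 3D output stays consistent with the 2D views it was conditioned on.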

Paper Structure

This paper contains 11 sections, 4 equations, 3 figures, 1 table.

Figures (3)

  • Figure 1: Representative CT reconstruction results shown in the mid-axial (1st row), mid-sagittal (2nd row), and mid-coronal (3rd row) views. Our method is compared with baseline methods (PSR, 3DCNN, Diff2CT) and the ground truth (GT).
  • Figure 2: Analysis of results. A lighter color indicates a larger SSIM value, and a larger circle diameter indicates a larger PSNR value.
  • Figure 3: An example of a low-resolution CT image with an original pixel spacing of [0.9765, 0.9765, 0.8]. In preprocessing, we first resample all CT images to a unified pixel spacing of [2, 2, 2].
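
The resampling step mentioned in the Figure 3 caption can be sketched as follows. This is a hedged illustration only: the interpolation method used by the authors is not stated, so nearest-neighbor lookup is swapped in for simplicity, the volume size is hypothetical, and the spacing is assumed to be listed in the same order as the array axes.

```python
import numpy as np

def resample_nearest(volume, spacing, new_spacing=(2.0, 2.0, 2.0)):
    """Resample a 3D volume to a new voxel spacing using nearest-neighbor lookup."""
    spacing = np.asarray(spacing, dtype=float)
    new_spacing = np.asarray(new_spacing, dtype=float)
    # Preserve physical extent: new_size = old_size * old_spacing / new_spacing
    new_shape = np.maximum(
        1, np.round(np.array(volume.shape) * spacing / new_spacing)
    ).astype(int)
    # For each output index along each axis, pick the nearest source voxel.
    idx = [
        np.minimum(np.round(np.arange(n) * ns / s).astype(int), dim - 1)
        for n, ns, s, dim in zip(new_shape, new_spacing, spacing, volume.shape)
    ]
    return volume[np.ix_(*idx)]

ct = np.zeros((256, 256, 100), dtype=np.float32)  # hypothetical volume size
resampled = resample_nearest(ct, spacing=(0.9765, 0.9765, 0.8))
print(resampled.shape)  # (125, 125, 40)
```

Moving from roughly 1-unit to 2-unit spacing halves the grid along each in-plane axis, which shrinks the volume considerably and makes 3D training more tractable, at the cost of fine detail.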