GaSpCT: Gaussian Splatting for Novel CT Projection View Synthesis

Emmanouil Nikolakakis; Utkarsh Gupta; Jonathan Vengosh; Justin Bui; Razvan Marinescu

TL;DR

GaSpCT introduces an implicit 3D scene representation for brain CT by adapting Gaussian Splatting to synthesize novel projection views from sparse 2D projections. It extends the baseline with total variation (TV) and Beta regularizers, initializes splats within an ellipsoid, and uses camera parameters derived from DICOM metadata, which removes the need for Structure from Motion. Training is formalized through a total loss $\mathcal{L}_{Final}$ that combines sparsity and perceptual terms. Empirical results on digitally reconstructed radiographs (DRRs) show that GaSpCT outperforms MedNeRF, MipNeRF360, and a baseline GaS on PSNR, SSIM, and LPIPS, while requiring only 5–10 minutes of per-scan training and a memory footprint of 27–42 MB, even with as few as 5% of the views. This approach enables high-fidelity novel CT view synthesis with reduced radiation exposure and paves the way for curved-detector camera modeling and cross-subject latent representations in CT imaging.
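
The ellipsoid initialization mentioned above can be illustrated with a minimal sketch. It assumes that the initial Gaussian centers are drawn uniformly from the interior of an axis-aligned ellipsoid. The function name `sample_ellipsoid`, the semi-axis values, and the number of points are hypothetical and are not taken from the paper.

```python
import numpy as np

def sample_ellipsoid(n, semi_axes, center=(0.0, 0.0, 0.0), seed=0):
    """Draw n points uniformly from the interior of an axis-aligned ellipsoid.

    Assumption: this is one plausible way to seed Gaussian centers;
    the paper's exact procedure is not reproduced here.
    """
    rng = np.random.default_rng(seed)
    # Uniform directions on the unit sphere: normalize Gaussian samples.
    directions = rng.normal(size=(n, 3))
    directions /= np.linalg.norm(directions, axis=1, keepdims=True)
    # Cube-root radii make the density uniform in volume inside the unit ball.
    radii = rng.random(n) ** (1.0 / 3.0)
    unit_ball = directions * radii[:, None]
    # An affine stretch of a uniform ball is a uniform ellipsoid.
    return unit_ball * np.asarray(semi_axes) + np.asarray(center)

# Hypothetical semi-axes, in normalized scene units.
points = sample_ellipsoid(10_000, semi_axes=(0.8, 1.0, 0.7))
```

Sampling in the unit ball first and then stretching keeps the density uniform, whereas scaling the radius directly by a random number would cluster points near the center.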

Abstract

We present GaSpCT, a novel view synthesis and 3D scene representation method used to generate novel projection views for Computed Tomography (CT) scans. We adapt the Gaussian Splatting framework to enable novel view synthesis in CT based on limited sets of 2D image projections and without the need for Structure from Motion (SfM) methodologies. As a result, we reduce the total scanning duration and the radiation dose the patient receives during the scan. We adapt the loss function to our use case by encouraging a stronger distinction between background and foreground using two sparsity-promoting regularizers: a beta loss and a total variation (TV) loss. We also initialize the Gaussian locations across the 3D space using a uniform prior distribution over the region where the brain is expected to lie within the field of view. We evaluate the performance of our model using brain CT scans from the Parkinson's Progression Markers Initiative (PPMI) dataset and demonstrate that the rendered novel views closely match the original projection views of the simulated scan and outperform other implicit 3D scene representation methodologies. Furthermore, we empirically observe reduced training time compared to neural-network-based image synthesis for sparse-view CT image reconstruction. Finally, the memory requirements of the Gaussian Splatting representations are reduced by 17% compared to the equivalent voxel grid image representations.
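
To make the two sparsity-promoting regularizers concrete, the following is a minimal NumPy sketch. It assumes an anisotropic TV term and a beta-style prior of the form log(x) + log(1 - x), which is one common formulation and not necessarily the one used in the paper. The L1 reconstruction term stands in for the paper's full loss, and the weights `lam_tv` and `lam_beta` are hypothetical.

```python
import numpy as np

def total_variation(img):
    """Anisotropic TV: mean absolute difference between neighbouring pixels."""
    dx = np.abs(img[:, 1:] - img[:, :-1])
    dy = np.abs(img[1:, :] - img[:-1, :])
    return dx.mean() + dy.mean()

def beta_regularizer(img, eps=1e-6):
    """Mean of log(x) + log(1 - x); lowest when pixel values sit near 0 or 1.

    Assumption: a common beta-style prior, not the paper's exact term.
    """
    x = np.clip(img, eps, 1.0 - eps)  # avoid log(0)
    return np.mean(np.log(x) + np.log(1.0 - x))

def regularized_loss(render, target, lam_tv=0.01, lam_beta=0.001):
    """Illustrative total: L1 reconstruction plus weighted regularizers.

    The weights are placeholders, not values reported in the paper.
    """
    recon = np.abs(render - target).mean()
    return recon + lam_tv * total_variation(render) + lam_beta * beta_regularizer(render)
```

Minimizing the beta term pushes rendered intensities toward 0 or 1, which sharpens the background and foreground separation, while the TV term penalizes noisy pixel-to-pixel variation.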

Paper Structure (16 sections, 4 equations, 4 figures, 2 tables)

Introduction
Methodology
GaSpCT
Gaussian Splatting
Total Variation Regularization
Beta Distribution Regularizer
Total Loss Function
Experiments
Dataset
Digitally Reconstructed Radiograph
Challenges with Structure from Motion on CT Images
Camera Extrinsics and Intrinsics of CT Images
Setup
Results
Conclusion & Future Work
...and 1 more section

Figures (4)

Figure 1: Optimization of 3D Gaussians initialized as an ellipsoid. Using the camera poses extracted from the DICOM metadata, we can perform forward passes and backpropagation through the differentiable Gaussian rasterizer.
Figure 2: Top row shows the original 3D voxel-based DICOM image retrieved from the PPMI dataset. This image is used as an input to the DRR algorithm to generate the projection image views. The bottom row shows the resulting projection views: 1) 0° 2) 90° 3) 180°
Figure 3: Four different angular views: a) 0° b) 90° c) 180° d) 270°. The top row contains the ground truth images, while the bottom row contains the rendered images for the equivalent camera poses.
Figure 4: Four different renderings for the same projection view when testing with a) 50% b) 25% c) 10% d) 5% of the total images.
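
The sparse-view settings in Figure 4 can be mimicked with a short sketch. It assumes that the stated percentage is the share of projection views kept for training and that those views are evenly spaced in angle. The source does not specify the selection scheme, and both the function name `split_views` and the count of 360 views are hypothetical.

```python
import numpy as np

def split_views(n_views, train_fraction):
    """Keep an evenly spaced subset of view indices; the rest are held out.

    Assumption: even angular spacing. The paper's actual selection
    scheme is not stated in this summary.
    """
    n_train = max(1, int(round(n_views * train_fraction)))
    train_idx = np.unique(
        np.linspace(0, n_views, n_train, endpoint=False).astype(int)
    )
    test_idx = np.setdiff1d(np.arange(n_views), train_idx)
    return train_idx, test_idx

# Hypothetical scan with 360 projection views, one per degree.
for fraction in (0.50, 0.25, 0.10, 0.05):
    train_idx, test_idx = split_views(360, fraction)
    print(f"{fraction:.0%}: {len(train_idx)} kept, {len(test_idx)} held out")
```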
