GaSpCT: Gaussian Splatting for Novel CT Projection View Synthesis
Emmanouil Nikolakakis, Utkarsh Gupta, Jonathan Vengosh, Justin Bui, Razvan Marinescu
TL;DR
GaSpCT introduces an implicit 3D scene representation for brain CT by adapting Gaussian Splatting to synthesize novel projection views from sparse 2D projections. It extends the baseline with TV and Beta regularizers, initializes splats within an ellipsoid, and derives camera parameters from DICOM metadata, which eliminates the need for Structure from Motion. Training is driven by a total loss $\mathcal{L}_{Final}$ that combines sparsity and perceptual terms. Empirical results on DRRs show that GaSpCT outperforms MedNeRF, MipNeRF360, and a baseline GaS on PSNR, SSIM, and LPIPS, while requiring only 5–10 minutes of per-scan training and a memory footprint of 27–42 MB, even with as few as 5% of the views. This approach enables high-fidelity novel CT view synthesis with reduced radiation exposure and paves the way for curved-detector camera modeling and cross-subject latent representations in CT imaging.
Abstract
We present GaSpCT, a novel view synthesis and 3D scene representation method used to generate novel projection views for Computed Tomography (CT) scans. We adapt the Gaussian Splatting framework to enable novel view synthesis in CT from limited sets of 2D image projections, without the need for Structure from Motion (SfM) methodologies. Because fewer projections need to be acquired, the total scanning duration and the radiation dose the patient receives during the scan are reduced. We adapt the loss function to our use case by encouraging a stronger distinction between background and foreground using two sparsity-promoting regularizers: a beta loss and a total variation (TV) loss. Finally, we initialize the Gaussian locations in 3D space by sampling from a uniform prior distribution over the region where the brain is expected to lie within the field of view. We evaluate the performance of our model using brain CT scans from the Parkinson's Progression Markers Initiative (PPMI) dataset and demonstrate that the rendered novel views closely match the original projection views of the simulated scan and outperform other implicit 3D scene representation methods. Furthermore, we empirically observe reduced training time compared to neural-network-based image synthesis for sparse-view CT image reconstruction. Lastly, the memory requirements of the Gaussian Splatting representations are 17% lower than those of the equivalent voxel grid image representations.
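The two regularizers and the initialization prior described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the exact form of the beta term (a log-based penalty that is lowest when values sit near 0 or 1), the anisotropic mean-absolute-difference TV term, and the ellipsoid semi-axes are all assumptions chosen for illustration.

```python
import numpy as np

def sample_in_ellipsoid(n, semi_axes, center=(0.0, 0.0, 0.0), seed=0):
    """Draw n points uniformly inside an axis-aligned ellipsoid (hypothetical helper).

    Points are sampled uniformly in the unit ball, then scaled by the semi-axes;
    a linear map keeps the density uniform.
    """
    rng = np.random.default_rng(seed)
    directions = rng.normal(size=(n, 3))
    directions /= np.linalg.norm(directions, axis=1, keepdims=True)
    # The cube root of a uniform variable gives a uniform radial density in 3D.
    radii = rng.uniform(size=(n, 1)) ** (1.0 / 3.0)
    return np.asarray(center) + directions * radii * np.asarray(semi_axes)

def tv_loss(image):
    """Anisotropic total variation: mean absolute difference between neighbors."""
    dy = np.abs(np.diff(image, axis=0)).mean()
    dx = np.abs(np.diff(image, axis=1)).mean()
    return dx + dy

def beta_loss(image, eps=1e-6):
    """Assumed beta-style penalty: smallest when pixel values are near 0 or 1."""
    x = np.clip(image, eps, 1.0 - eps)
    return np.mean(np.log(x) + np.log(1.0 - x))

# Hypothetical usage: initial splat centers and regularizers on a rendered view.
centers = sample_in_ellipsoid(1000, semi_axes=(1.0, 2.0, 3.0))
rendered = np.full((8, 8), 0.5)  # stand-in for a rendered projection in [0, 1]
regularizer = tv_loss(rendered) + beta_loss(rendered)
```

In a training loop, terms like these would typically be added to the photometric loss with small weights; the weights themselves are hyperparameters that this sketch does not try to guess.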
