X-GRM: Large Gaussian Reconstruction Model for Sparse-view X-rays to Computed Tomography

Yifan Liu; Wuyang Li; Weihao Yu; Chenxin Li; Alexandre Alahi; Max Meng; Yixuan Yuan

TL;DR

X-GRM reconstructs 3D CT volumes from sparse-view X-ray projections using a large-scale, feed-forward transformer architecture and a flexible volume representation called VoxGS. The method decouples projection encoding (an X-ray Reconstruction Transformer composed of an Encoder ViT and a Fusion ViT) from volume decoding, which uses voxel-centered Gaussian primitives to enable efficient, differentiable X-ray rendering and direct volume extraction. Trained on a large, diverse CT dataset, X-GRM achieves state-of-the-art reconstruction quality and fast inference across multiple sparse-view settings, generalizes well across datasets, and supports novel-view synthesis. This combination promises practical impact for low-dose, time-sensitive clinical workflows and opens avenues for integrated CT/X-ray applications and downstream rendering tasks.

Abstract

Computed Tomography (CT) is an indispensable tool in clinical workflows, providing non-invasive visualization of internal anatomical structures. Existing CT reconstruction methods are limited by small-capacity model architectures and inflexible volume representations. In this work, we present X-GRM (X-ray Gaussian Reconstruction Model), a large feed-forward model for reconstructing 3D CT volumes from sparse-view 2D X-ray projections. X-GRM employs a scalable transformer-based architecture to encode sparse-view X-ray inputs, in which tokens from different views are integrated efficiently. These tokens are then decoded into a novel volume representation, Voxel-based Gaussian Splatting (VoxGS), which enables efficient CT volume extraction and differentiable X-ray rendering. This combination of a high-capacity model and a flexible volume representation enables our model to produce high-quality reconstructions from varied test inputs, including in-domain and out-of-domain X-ray projections. Our code is available at: https://github.com/CUHK-AIM-Group/X-GRM.
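
To make the idea of a voxel-centered Gaussian representation more concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes one isotropic Gaussian per voxel, anchored at the voxel center with a small offset, a shared scale `sigma`, and a toy parallel-beam projection along a grid axis. The function names (`voxel_centers`, `splat_to_volume`, `project`) and all parameters are hypothetical; the actual VoxGS parameterization and X-ray renderer are defined in the paper and the released code.

```python
import numpy as np

def voxel_centers(n):
    """Centers of an n x n x n grid over the unit cube, shape (n**3, 3)."""
    axis = (np.arange(n) + 0.5) / n
    grid = np.stack(np.meshgrid(axis, axis, axis, indexing="ij"), axis=-1)
    return grid.reshape(-1, 3)

def splat_to_volume(centers, offsets, densities, sigma, n):
    """Evaluate a sum of isotropic Gaussians on an n^3 grid (assumed form).

    centers, offsets: (G, 3) anchor positions and per-Gaussian shifts
    densities:        (G,) non-negative weights
    sigma:            shared standard deviation (a simplifying assumption)
    """
    query = voxel_centers(n)                      # (n^3, 3) sample points
    means = centers + offsets                     # (G, 3) Gaussian means
    d2 = ((query[:, None, :] - means[None, :, :]) ** 2).sum(axis=-1)
    weights = np.exp(-0.5 * d2 / sigma**2)        # (n^3, G)
    return (weights @ densities).reshape(n, n, n)

def project(volume, axis=0):
    """Toy parallel-beam X-ray: mean attenuation along one grid axis."""
    return volume.sum(axis=axis) / volume.shape[axis]

# Small usage example: one Gaussian per voxel on an 8^3 grid.
n = 8
centers = voxel_centers(n)
offsets = np.zeros_like(centers)
densities = np.ones(len(centers))
volume = splat_to_volume(centers, offsets, densities, sigma=0.5 / n, n=n)
projection = project(volume, axis=0)
```

Because both steps are sums of smooth functions of the Gaussian parameters, the same computation written in an autodiff framework would be differentiable with respect to `offsets`, `densities`, and `sigma`, which is the property the abstract refers to as differentiable X-ray rendering. The dense pairwise evaluation here is only practical for tiny grids.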

