C^2RV: Cross-Regional and Cross-View Learning for Sparse-View CBCT Reconstruction
Yiqun Lin, Jiewen Yang, Hualiang Wang, Xinpeng Ding, Wei Zhao, Xiaomeng Li
TL;DR
This work tackles sparse-view cone-beam CT reconstruction by framing it as a 3D representation problem and introducing cross-regional and cross-view learning. It proposes C^2RV, which combines multi-scale 3D volumetric representations (MS-3DV) with scale-view cross-attention (SVC-Att) to fuse voxel-aligned and view-aligned features for accurate attenuation estimation. Across chest and knee datasets, C^2RV achieves consistent, significant improvements over state-of-the-art methods in PSNR and SSIM, while also delivering better segmentation alignment in downstream tasks. The approach reduces reliance on dense projections, enabling high-quality reconstructions with fewer views and showing robustness to mild variations in scanning parameters.
Abstract
Cone-beam computed tomography (CBCT) is an important imaging technology widely used in medical scenarios such as diagnosis and preoperative planning. Reconstructing CT from fewer projection views, known as sparse-view reconstruction, reduces ionizing radiation and can further benefit interventional radiology. Compared with sparse-view reconstruction for traditional parallel-beam or fan-beam CT, CBCT reconstruction is more challenging because measurement with cone-shaped X-ray beams increases the dimensionality of the problem. CBCT reconstruction is a 2D-to-3D problem. Although implicit neural representations have been introduced to enable efficient training, previous works consider only local features and process different views equally, resulting in spatial inconsistency and poor performance on complicated anatomies. To this end, we propose C^2RV, which leverages explicit multi-scale volumetric representations to enable cross-regional learning in 3D space. Additionally, we introduce a scale-view cross-attention module to adaptively aggregate multi-scale and multi-view features. Extensive experiments demonstrate that C^2RV achieves consistent and significant improvement over previous state-of-the-art methods on datasets with diverse anatomy.
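To make the idea of adaptively aggregating multi-scale and multi-view features more concrete, the following is a minimal sketch of cross-attention for a single 3D query point. It is not the authors' implementation. It assumes that the per-scale (voxel-aligned) features act as queries and the per-view (view-aligned) features act as keys and values, that no learned projections are used, and that the per-scale outputs are simply averaged. The function name `scale_view_attention` and all shapes are hypothetical.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def scale_view_attention(scale_feats, view_feats):
    """Fuse per-scale and per-view features for one 3D query point.

    scale_feats: S vectors of length C (hypothetical voxel-aligned features,
                 one per scale of the volumetric representation).
    view_feats:  V vectors of length C (hypothetical view-aligned features,
                 one per projection view).
    Returns (fused, weights): fused has length C, weights is S x V.
    Assumption: scales are queries, views are keys/values, identity projections.
    """
    dim = len(scale_feats[0])
    weights, attended = [], []
    for q in scale_feats:
        # Scaled dot-product scores of this scale's query against every view.
        w = softmax([dot(q, k) / math.sqrt(dim) for k in view_feats])
        weights.append(w)
        # Weighted sum of view features: views are no longer treated equally.
        attended.append(
            [sum(wi * v[c] for wi, v in zip(w, view_feats)) for c in range(dim)]
        )
    # Average the per-scale outputs into a single feature (an assumption).
    fused = [sum(row[c] for row in attended) / len(attended) for c in range(dim)]
    return fused, weights


random.seed(0)
S, V, C = 3, 6, 8  # illustrative numbers of scales, views, and channels
scale_feats = [[random.gauss(0, 1) for _ in range(C)] for _ in range(S)]
view_feats = [[random.gauss(0, 1) for _ in range(C)] for _ in range(V)]
fused, weights = scale_view_attention(scale_feats, view_feats)
```

Each row of `weights` sums to one, so every scale distributes its attention across the views rather than weighting them uniformly. In a full model, the fused feature would presumably be passed to a decoder that predicts the attenuation value at the query point.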
