Coarse-Fine View Attention Alignment-Based GAN for CT Reconstruction from Biplanar X-Rays
Zhi Qiao, Hanqiang Ouyang, Dongheng Chu, Huishu Yuan, Xiantong Zhen, Pei Dong, Zhen Qian
TL;DR
This work tackles 3D CT reconstruction from biplanar X-rays to aid surgical planning by exploiting complementary information from orthogonal views. The authors propose CVAA-GAN, a generative framework that pairs a two-branch encoder with a coarse-fine view attention alignment (CVAA) module in the decoder to fuse view-specific features, augmented by a fine-distillation path and a Patch-3D discriminator. Training combines a least-squares GAN adversarial loss with a projection-based reconstruction loss computed on three orthogonal planes to enforce global shape consistency. On a lumbar vertebra dataset built from digitally reconstructed radiographs (DRRs), CVAA-GAN outperforms PSR, 3DCNN, and X2CT-GAN across multiple metrics (e.g., MAE, MSE, Cosine Similarity, PSNR, SSIM). The results show better preservation of anatomy and sharper boundaries, with potential clinical impact for intraoperative imaging when CT is unavailable.
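The two loss terms named in the TL;DR can be illustrated with a minimal sketch. It is not the authors' implementation: the 0.5 scaling of the least-squares terms, the use of average-intensity projections with an L1 penalty, and the weights `adv_weight` and `proj_weight` are all assumptions made for illustration. Plain Python lists are used to keep the example dependency-free; a real model would compute these on tensors.

```python
from statistics import fmean


def lsgan_d_loss(real_scores, fake_scores):
    """Least-squares discriminator loss: push real scores to 1, fake scores to 0."""
    real_term = fmean([(s - 1.0) ** 2 for s in real_scores])
    fake_term = fmean([s ** 2 for s in fake_scores])
    return 0.5 * (real_term + fake_term)  # 0.5 scaling is an assumption


def lsgan_g_loss(fake_scores):
    """Least-squares generator loss: push scores of generated volumes to 1."""
    return 0.5 * fmean([(s - 1.0) ** 2 for s in fake_scores])


def project(volume, axis):
    """Average-intensity projection of a nested-list volume[z][y][x] along one axis."""
    nz, ny, nx = len(volume), len(volume[0]), len(volume[0][0])
    if axis == 0:  # collapse z, leaving a (y, x) plane
        return [[fmean([volume[z][y][x] for z in range(nz)]) for x in range(nx)]
                for y in range(ny)]
    if axis == 1:  # collapse y, leaving a (z, x) plane
        return [[fmean([volume[z][y][x] for y in range(ny)]) for x in range(nx)]
                for z in range(nz)]
    if axis == 2:  # collapse x, leaving a (z, y) plane
        return [[fmean(volume[z][y]) for y in range(ny)] for z in range(nz)]
    raise ValueError("axis must be 0, 1, or 2")


def projection_loss(pred, target):
    """Mean absolute projection error, averaged over the three orthogonal planes."""
    total = 0.0
    for axis in range(3):
        p, t = project(pred, axis), project(target, axis)
        diffs = [abs(a - b) for row_p, row_t in zip(p, t) for a, b in zip(row_p, row_t)]
        total += fmean(diffs)
    return total / 3.0


def generator_loss(fake_scores, pred, target, adv_weight=1.0, proj_weight=1.0):
    """Weighted sum of both terms; the weights are hypothetical, not from the paper."""
    return adv_weight * lsgan_g_loss(fake_scores) + proj_weight * projection_loss(pred, target)
```

For example, `projection_loss(volume, volume)` is zero for any volume, while a discriminator that scores every generated patch as 0 gives `lsgan_g_loss` its value of 0.5 under this scaling. The projection term only constrains the collapsed planes, which is why it acts on global shape rather than on individual voxels.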
Abstract
For surgical planning and intraoperative imaging, CT reconstruction from X-ray images can be an important alternative when CT imaging is unavailable or infeasible. In this paper, we aim to reconstruct a 3D CT image from biplanar X-rays, because biplanar X-rays convey richer information than single-view X-rays and are more commonly used by surgeons. Unlike previous studies, which treated the two X-ray views without distinction when fusing the cross-view data, we propose a novel attention-informed coarse-to-fine cross-view fusion method to combine the features extracted from the orthogonal biplanar views. This method consists of a view attention alignment sub-module and a fine-distillation sub-module, which are designed to work together to highlight the unique or complementary information from each view. Experiments demonstrate that the proposed method outperforms state-of-the-art methods.
