Multimodal Learning With Intraoperative CBCT & Variably Aligned Preoperative CT Data To Improve Segmentation
Maximilian E. Tschuchnig, Philipp Steininger, Michael Gadermayr
TL;DR
This work addresses segmentation in intraoperative CBCT, which is often degraded by artifacts, by fusing roughly aligned preoperative CT data using an early-fusion multimodal approach. Using a LiTS-based dataset with synthetic CBCT generated via DRRs and misalignment modeled by affine and elastic deformations, the authors evaluate 45 multimodal configurations on a 3D U-Net, varying DRR undersampling $\alpha_{np}$ and alignment $\alpha_a$. The results show that multimodal CBCT+CT fusion generally improves segmentation, with the largest gains when CBCT quality is poor, though elastic misalignment can reduce performance; the model also exhibits some implicit registration, underscoring the potential need for preregistration for complex targets like tumors. The findings support practical applicability for real-time computer-assisted interventions and motivate future work on real paired data, integrated preregistration, and exploring late or hybrid fusion strategies.
Abstract
Cone-beam computed tomography (CBCT) is an important tool for computer-aided interventions, although it often suffers from artifacts that make accurate interpretation difficult. While the degraded image quality can affect downstream segmentation, the availability of high-quality preoperative scans offers potential for improvement. Here we consider a setting where preoperative CT and intraoperative CBCT scans are both available, but the alignment (registration) between the scans is imperfect. We propose a multimodal learning method that fuses roughly aligned CBCT and CT scans and investigate the effect of CBCT quality and misalignment on the final segmentation performance. For that purpose, we use a synthetically generated dataset containing real CT and synthetic CBCT volumes. As an application scenario, we focus on liver and liver tumor segmentation. We show that the fusion of preoperative CT and simulated intraoperative CBCT mostly improves segmentation performance (compared to using intraoperative CBCT only) and that even clearly misaligned preoperative data has the potential to improve segmentation performance.
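To make the fusion setting concrete, the following is a minimal sketch of one common way to realize early fusion: stacking the CBCT volume and a misaligned CT volume as two input channels of a 3D segmentation network. Whether the authors implement fusion exactly this way is an assumption. The helper names `translate` and `early_fusion`, the random toy volumes, and the noise model are hypothetical, and an integer translation is used only as a simplified stand-in for the affine and elastic deformations described in the work.

```python
import numpy as np


def translate(volume, shift):
    """Shift a 3D volume by integer voxel offsets, zero-filling exposed voxels.

    A pure translation is a special case of an affine transform; it is used
    here only as a simplified stand-in for the misalignment in the paper.
    """
    out = np.zeros_like(volume)
    src = tuple(
        slice(max(0, -s), volume.shape[i] - max(0, s)) for i, s in enumerate(shift)
    )
    dst = tuple(
        slice(max(0, s), volume.shape[i] - max(0, -s)) for i, s in enumerate(shift)
    )
    out[dst] = volume[src]
    return out


def early_fusion(cbct, ct):
    """Stack CBCT and CT along a new leading channel axis: (2, D, H, W)."""
    if cbct.shape != ct.shape:
        raise ValueError("CBCT and CT volumes must share the same shape")
    return np.stack([cbct, ct], axis=0)


# Toy data (assumption): a random "CT" and a noisy copy standing in for CBCT.
rng = np.random.default_rng(0)
ct = rng.random((8, 8, 8)).astype(np.float32)
cbct = ct + 0.1 * rng.standard_normal(ct.shape).astype(np.float32)

# Misalign the preoperative CT, then fuse it with the CBCT at the input level.
ct_misaligned = translate(ct, (2, 0, -1))
fused = early_fusion(cbct, ct_misaligned)
print(fused.shape)  # (2, 8, 8, 8)
```

The fused array would then be passed to a network whose first convolution accepts two input channels, so the model sees both modalities from the first layer onward and can, in principle, learn to compensate for small misalignments.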
