Multimodal Learning With Intraoperative CBCT & Variably Aligned Preoperative CT Data To Improve Segmentation

Maximilian E. Tschuchnig, Philipp Steininger, Michael Gadermayr

TL;DR

This work addresses segmentation in intraoperative CBCT, which is often degraded by artifacts, by fusing roughly aligned preoperative CT data in an early-fusion multimodal approach. Using a LiTS-based dataset with synthetic CBCT generated via DRRs and misalignment modeled by affine and elastic deformations, the authors evaluate 45 multimodal configurations on a 3D U-Net, varying the DRR undersampling $\alpha_{np}$ and the alignment factor $\alpha_a$. The results show that multimodal CBCT+CT fusion generally improves segmentation, with the largest gains when CBCT quality is poor, though elastic misalignment can reduce performance. The model also exhibits some implicit registration, yet preregistration may still be needed for complex targets such as tumors. The findings support practical applicability for real-time computer-assisted interventions and motivate future work on real paired data, integrated preregistration, and late or hybrid fusion strategies.

Abstract

Cone-beam computed tomography (CBCT) is an important tool facilitating computer-aided interventions, despite often suffering from artifacts that pose challenges for accurate interpretation. While the degraded image quality can affect downstream segmentation, the availability of high-quality preoperative scans represents potential for improvements. Here we consider a setting where preoperative CT and intraoperative CBCT scans are available, but the alignment (registration) between the scans is imperfect. We propose a multimodal learning method that fuses roughly aligned CBCT and CT scans and investigate the effect of CBCT quality and misalignment on the final segmentation performance. For that purpose, we use a synthetically generated data set containing real CT and synthetic CBCT volumes. As an application scenario, we focus on liver and liver tumor segmentation. We show that the fusion of preoperative CT and simulated intraoperative CBCT mostly improves segmentation performance (compared to using intraoperative CBCT only) and that even clearly misaligned preoperative data has the potential to improve segmentation performance.
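
As a minimal sketch of what early fusion of the two scans could look like, the snippet below stacks a CBCT volume and a roughly aligned CT volume along a new channel axis before they would enter a segmentation network. Channel-wise stacking is an assumption made for illustration; the function name `early_fuse`, the volume shapes, and the random placeholder data are hypothetical and not taken from the paper.

```python
import numpy as np


def early_fuse(cbct: np.ndarray, ct: np.ndarray) -> np.ndarray:
    """Stack two (D, H, W) volumes into one (2, D, H, W) network input.

    Assumption: both volumes were already resampled to the same grid,
    even if their anatomical alignment is only approximate.
    """
    if cbct.shape != ct.shape:
        raise ValueError(f"shape mismatch: {cbct.shape} vs {ct.shape}")
    # Channel 0 holds the intraoperative CBCT, channel 1 the preoperative CT.
    return np.stack([cbct, ct], axis=0)


# Placeholder volumes standing in for real scans (hypothetical sizes).
rng = np.random.default_rng(0)
cbct = rng.normal(size=(8, 16, 16)).astype(np.float32)
ct = rng.normal(size=(8, 16, 16)).astype(np.float32)

fused = early_fuse(cbct, ct)
print(fused.shape)
```

With this layout, a CBCT-only baseline would differ only in the number of input channels, which keeps the comparison between unimodal and multimodal inputs simple.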

Paper Structure (9 sections, 4 figures, 1 table)

Figures (4)

  • Figure 1: Multimodal model configuration. After fusing the intraoperative CBCT and preoperative CT (early fusion), the data was processed by the U-Net shown, segmenting liver and liver tumors.
  • Figure 2: Data generation process: after centering the original CT volumes around the liver (using the liver segmentations), $\alpha_{np}$ projections were simulated. Finally, the CBCT volumes were simulated and aligned with the original CT volumes and masks, so that the CT and masks fit the CBCT field of view.
  • Figure 3: Results of the random affine augmentation with differing augmentation factor $\alpha_a \in \{0, 0.125, 0.25, 0.5, 1\}$ for $4$ different volumes.
  • Figure 4: Results of the random elastic augmentation with differing augmentation factor $\alpha_a \in \{0, 0.125, 0.25, 0.5, 1\}$ for $4$ different volumes.
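
One way to picture the role of the augmentation factor $\alpha_a$ in the affine case is as a multiplier on the sampled rotation, translation, and scaling, so that $\alpha_a = 0$ yields the identity transform and $\alpha_a = 1$ the full range. The sketch below follows that assumption; the function `random_affine` and the maximal ranges `max_rot_deg`, `max_shift`, and `max_scale` are hypothetical values chosen for illustration, not the settings used in the paper, and the elastic case is not covered.

```python
import numpy as np


def random_affine(alpha_a, rng, max_rot_deg=10.0, max_shift=5.0, max_scale=0.1):
    """Sample a 4x4 homogeneous affine whose strength is scaled by alpha_a.

    All max_* ranges are hypothetical placeholders.
    """
    # Each parameter is drawn from a symmetric range and shrunk by alpha_a.
    angle = np.deg2rad(rng.uniform(-1.0, 1.0) * max_rot_deg * alpha_a)
    shift = rng.uniform(-1.0, 1.0, size=3) * max_shift * alpha_a
    scale = 1.0 + rng.uniform(-1.0, 1.0, size=3) * max_scale * alpha_a

    # Rotation about the third axis only, to keep the sketch short.
    c, s = np.cos(angle), np.sin(angle)
    rot = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

    mat = np.eye(4)
    mat[:3, :3] = rot @ np.diag(scale)
    mat[:3, 3] = shift
    return mat


rng = np.random.default_rng(0)
for alpha_a in (0, 0.125, 0.25, 0.5, 1):
    mat = random_affine(alpha_a, rng)
    # Largest entry-wise deviation from the identity transform.
    print(alpha_a, float(np.abs(mat - np.eye(4)).max()))
```

Applying such a matrix to the CT volume (for example with a resampling routine) would then produce misaligned inputs of graded severity.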