US-X Complete: A Multi-Modal Approach to Anatomical 3D Shape Recovery

Miruna-Alexandra Gafencu, Yordanka Velikova, Nassir Navab, Mohammad Farid Azampour

TL;DR

US-X Complete introduces a first multi-modal spine shape completion framework that fuses intraoperative ultrasound with a single lateral X-ray to recover full 3D vertebral anatomy. The method uses a two-stage coarse-to-fine variational autoencoder with Early Fusion and Late Fusion to integrate ultrasound and X-ray information in a unified 3D representation, trained on synthetic VerSe2020-derived data and validated on spine phantoms. Results show statistically significant improvements over ultrasound-only approaches in vertebral arch and body reconstruction, suggesting potential for clinical translation without preoperative CT registration. The work highlights robust multi-modal integration, efficient inference, and potential to enhance ultrasound-guided spinal navigation.

Abstract

Ultrasound offers a radiation-free, cost-effective solution for real-time visualization of spinal landmarks, paraspinal soft tissues and neurovascular structures, making it valuable for intraoperative guidance during spinal procedures. However, ultrasound suffers from inherent limitations in visualizing complete vertebral anatomy, in particular vertebral bodies, due to acoustic shadowing effects caused by bone. In this work, we present a novel multi-modal deep learning method for completing occluded anatomical structures in 3D ultrasound by leveraging complementary information from a single X-ray image. To enable training, we generate paired training data consisting of: (1) 2D lateral vertebral views that simulate X-ray scans, and (2) 3D partial vertebrae representations that mimic the limited visibility and occlusions encountered during ultrasound spine imaging. Our method integrates morphological information from both imaging modalities and demonstrates significant improvements in vertebral reconstruction (p < 0.001) compared to the state of the art in 3D ultrasound vertebral completion. We perform phantom studies as an initial step toward future clinical translation, and achieve a more accurate, complete volumetric lumbar spine visualization overlaid on the ultrasound scan without the need for registration with preoperative modalities such as computed tomography. This demonstrates that integrating a single X-ray projection mitigates ultrasound's key limitation while preserving its strengths as the primary imaging modality. Code and data can be found at https://github.com/miruna20/US-X-Complete

Paper Structure

This paper contains 20 sections, 5 figures, 1 table.

Figures (5)

  • Figure 1: Overview of the proposed method: (a) Data Generation Pipeline: Synthetic training data is derived from annotated CT scans, simulating ultrasound-consistent partial vertebral observations to mimic acoustic shadowing. Simultaneously, 2D lateral X-ray projections of 3D vertebral segmentations are generated. These multi-modal observations are merged into an anatomically aligned 3D point cloud representation. (b) The aligned observations undergo a two-stage completion process. The coarse stage extracts global features from both modalities to generate a vertebral template, which is then refined with detailed features from ultrasound and X-ray data to produce the final complete shape.
  • Figure 2: The coarse-stage network is trained to reconstruct the full vertebral shape from ground truth data, which implicitly teaches it a shape prior for lumbar vertebrae. Simultaneously, it learns to complete the shape conditioned on multi-modal partial observations from ultrasound and X-ray. As a result, the network outputs anatomically plausible coarse vertebral templates that integrate prior knowledge with image-based observations.
  • Figure 3: Data acquisition and pre-processing for validating the proposed method using two spine phantoms. We acquire registered ultrasound and X-ray scans and integrate both modalities into our unified 3D point cloud representation before feeding it into the multi-modal shape completion pipeline.
  • Figure 4: Qualitative comparison of the previously introduced ultrasound-based vertebral shape completion method [gafencu2024shape] versus our proposed multi-modal approach on two spine phantoms.
  • Figure 5: Two lumbar spine phantoms used to validate our method in a clinical-like setup.
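
The data generation step summarized in the Figure 1 caption (partial ultrasound-like observations, a lateral projection, and their merge into one 3D point cloud) can be illustrated with a minimal sketch. This is not the authors' pipeline: the axis convention (probe looking along +y from the posterior side, lateral view along x), the grid-based shadowing rule, the pixel size, the modality flag, and all function names are assumptions made for illustration, and the input is a random point cloud rather than a vertebral segmentation.

```python
import random


def simulate_ultrasound_partial(points, cell=2.0):
    """Keep only the first surface hit in each (x, z) beam cell.

    Assumption: the probe looks along +y, so the point with the smallest y
    in a cell is visible and everything behind it is acoustically shadowed.
    """
    first_hit = {}
    for x, y, z in points:
        key = (int(x // cell), int(z // cell))
        if key not in first_hit or y < first_hit[key][1]:
            first_hit[key] = (x, y, z)
    return list(first_hit.values())


def simulate_lateral_projection(points, plane_x=0.0, pixel=1.0):
    """Flatten the shape along the left-right axis (x) onto a sagittal plane.

    The result is a set of occupied pixels, returned as 3D points that all
    lie on the plane x = plane_x, so they share the ultrasound coordinate frame.
    """
    pixels = {(round(y / pixel), round(z / pixel)) for _, y, z in points}
    return [(plane_x, j * pixel, k * pixel) for j, k in sorted(pixels)]


def merge_observations(us_points, xray_points):
    """Concatenate both observations; the 4th value is a hypothetical
    modality flag (0 = ultrasound, 1 = X-ray)."""
    tagged_us = [(x, y, z, 0) for x, y, z in us_points]
    tagged_xray = [(x, y, z, 1) for x, y, z in xray_points]
    return tagged_us + tagged_xray


# Stand-in for a complete vertebra: random points in a box (illustration only).
rng = random.Random(0)
full = [
    (rng.uniform(-20, 20), rng.uniform(-15, 15), rng.uniform(-10, 10))
    for _ in range(2000)
]

us_partial = simulate_ultrasound_partial(full)
xray = simulate_lateral_projection(full)
merged = merge_observations(us_partial, xray)
print(len(full), len(us_partial), len(xray), len(merged))
```

In this sketch the pair (merged, full) plays the role of one training sample: the merged cloud is the multi-modal partial observation and the full cloud is the completion target. Keeping the projection as points on a plane inside the same coordinate frame is one simple way to obtain a single aligned 3D representation; the paper's actual encoding may differ.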