Deep Learning-based Point Cloud Registration for Augmented Reality-guided Surgery
Maximilian Weber, Daniel Wild, Jens Kleesiek, Jan Egger, Christina Gsaxner
TL;DR
The paper tackles image-to-patient registration for AR-guided surgery by evaluating deep learning-based point cloud registration on a cross-source dataset built from CT scans and HoloLens 2 depth captures. It benchmarks three deep learning (DL) methods (FMR, PointNetLK Revisited, and Deep Global Registration, DGR) against a conventional pipeline of global registration followed by ICP refinement, using 30 paired CT and AR point clouds of heads. The conventional pipeline generally outperforms the DL methods, although fine-tuned DGR shows promise and behaves repeatably; FMR and PointNetLK Revisited struggle with the cross-source disparities. The work highlights the need for domain-specific training data and points to hybrid strategies that combine DL-based initialization with ICP refinement as a route toward robust AR-guided surgery workflows.
Abstract
Point cloud registration aligns 3D point clouds by estimating the spatial transformation between them. It is an important task in computer vision, with applications in areas such as augmented reality (AR) and medical imaging. This work explores the intersection of two research trends: the integration of AR into image-guided surgery and the use of deep learning for point cloud registration. The main objective is to evaluate the feasibility of applying deep learning-based point cloud registration methods to image-to-patient registration in AR-guided surgery. We created a dataset of point clouds from medical imaging and corresponding point clouds captured with a popular AR device, the HoloLens 2. We evaluate three well-established deep learning models on registering these data pairs. While some deep learning methods show promise, a conventional registration pipeline still outperforms them on our challenging dataset.
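
To make the task concrete, the following is a minimal sketch of what a rigid registration result is and how its quality can be scored. It is not the evaluation code of this work. The toy points, the helper names (`apply_rigid_transform`, `rmse`, `rotation_z`), and the choice of root-mean-square error over known correspondences are assumptions made for illustration; real CT and HoloLens 2 point clouds come without known correspondences, which is what makes the problem hard.

```python
import math


def apply_rigid_transform(points, rotation, translation):
    """Apply p' = R p + t to every 3D point (R is a 3x3 nested list)."""
    transformed = []
    for p in points:
        transformed.append(tuple(
            sum(rotation[i][j] * p[j] for j in range(3)) + translation[i]
            for i in range(3)
        ))
    return transformed


def rmse(points_a, points_b):
    """Root-mean-square distance between corresponding points."""
    squared = [
        sum((a - b) ** 2 for a, b in zip(pa, pb))
        for pa, pb in zip(points_a, points_b)
    ]
    return math.sqrt(sum(squared) / len(squared))


def rotation_z(angle_rad):
    """Rotation matrix for a rotation about the z-axis."""
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


# Hypothetical toy "source" cloud (stand-in for a point cloud from imaging).
source = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]

# Hypothetical ground-truth pose of the "target" cloud (stand-in for the AR capture).
true_rotation = rotation_z(math.radians(30))
true_translation = (0.5, -0.2, 1.0)
target = apply_rigid_transform(source, true_rotation, true_translation)

# Error before registration, and after applying the correct transformation.
error_before = rmse(source, target)
error_after = rmse(
    apply_rigid_transform(source, true_rotation, true_translation), target
)
print(f"RMSE before: {error_before:.3f}, after: {error_after:.3f}")
```

A registration method, whether learned or conventional, has to estimate the rotation and translation that this sketch simply assumes to be known.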
