Localising under the drape: proprioception in the era of distributed surgical robotic system

Martin Huber, Nicola A. Cavalcanti, Ayoob Davoodi, Ruixuan Li, Christopher E. Mower, Fabio Carrillo, Christoph J. Laux, Francois Teyssere, Thibault Chandanson, Antoine Harlé, Elie Saghbiny, Mazda Farshad, Guillaume Morel, Emmanuel Vander Poorten, Philipp Fürnstahl, Sébastien Ourselin, Christos Bergeles, Tom Vercauteren

TL;DR

To the authors' knowledge, this is the first demonstration of marker-free proprioception for fully draped surgical robots, reducing setup complexity, enhancing safety, and paving the way toward modular and autonomous robotic surgery.

Abstract

Despite their mechanical sophistication, surgical robots remain blind to their surroundings. This lack of spatial awareness causes collisions, system recoveries, and workflow disruptions, issues that will intensify with the introduction of distributed robots with independent interacting arms. Existing tracking systems rely on bulky infrared cameras and reflective markers, providing only limited views of the surgical scene and adding hardware burden in crowded operating rooms. We present a marker-free proprioception method that enables precise localisation of surgical robots under their sterile draping despite the associated obstruction of visual cues. Our method relies solely on lightweight stereo-RGB cameras and novel transformer-based deep learning models. It builds on the largest multi-centre spatial robotic surgery dataset to date (1.4M self-annotated images from human cadaveric and preclinical in vivo studies). By tracking the entire robot and surgical scene, rather than individual markers, our approach provides a holistic view robust to occlusions, supporting surgical scene understanding and context-aware control. We demonstrate an example of potential clinical benefits during in vivo breathing compensation with access to tissue dynamics, unobservable under state-of-the-art tracking, and accurately localise robots in multi-robot systems for future intelligent interaction. In addition, compared with existing systems, our method eliminates markers and improves tracking visibility by 25%. To our knowledge, this is the first demonstration of marker-free proprioception for fully draped surgical robots, reducing setup complexity, enhancing safety, and paving the way toward modular and autonomous robotic surgery.

Paper Structure

This paper contains 37 sections, 13 equations, 9 figures, 4 tables.

Figures (9)

  • Figure 1: Localising under the drape. (A) Surgical setup during the zh porcine in vivo study. Following current clinical practice, a stereo infrared tracking camera was used to track spine breathing patterns at a distally screwed fiducial infrared marker (vertebra T7). (B) View of the bedside-mounted stereo-RGB camera during drilling at vertebra T15 (left). The stereo-RGB camera also tracked breathing patterns, but at anatomy-affixed AprilTags. Both cameras had exact knowledge of the surgical robot's location: the infrared camera through tool infrared markers, and the stereo-RGB camera via the proposed sdr-based \ref{sec:results.marker_free_surgical_robot_localisation_in_preclinical_workflow} (Fig. \ref{fig:results.method_overview}). This information allowed spine-applied forces to be referenced to the camera-observed breathing, i.e. proprioception, and was used for \ref{sec:results.breathing_compensated_drilling_in_robotic_porcine_in_vivo_spine_surgery}; see Fig. \ref{fig:results.breathing} for detailed data.
  • Figure 2: sdr. The proposed sdr algorithm maximises the overlay of robot segmentations with the projection of a virtual robot model \cite{lbr_stack}, thus yielding the robot location. Initially, SAM 2 \cite{sam2} was used to segment and subsequently locate the undraped surgical robot under preclinical workflow conditions. The robot was then draped in place and data were collected for \ref{sec:results.drape_and_occlusion_invariant_segmentation_of_surgical_robots}, see Fig. \ref{fig:results.transition}. For the surgical ldn mock spine surgery localisation benchmark, see Fig. \ref{fig:results.localisation_errors_benchmark}. The displayed data were obtained during the zh human GAX cadaveric studies. The surgeon point cloud on the right was observed from the ceiling-mounted stereo-RGB camera and co-referenced to the surgical robot.
  • Figure 3: Data collection and training scheme for drape- and occlusion-invariant segmentation of surgical robots. (Undraped localisation) Surgical robots were localised by the proposed sdr algorithm to kinematically generate ground-truth renders. Displayed samples are from the zh human GAX cadaveric, porcine ev / iv / pev, and ldn datasets (details in table \ref{tab:results.dataset}). (Drape- and occlusion-invariant segmentation) The robot segmentor (see also Fig. \ref{fig:results.method_overview}) was transitioned to clinically realistic workflow conditions, i.e. \ref{sec:results.drape_and_occlusion_invariant_segmentation_of_surgical_robots}, using the ground truth thus established. The novel cmm augmentation emulated multi-robot setups. For qualitative segmentation results, see Fig. \ref{fig:results.segmentations}.
  • Figure 4: ldn mock spine surgery localisation benchmark. Reprojection error medians with $25\%$ (Q1) and $75\%$ (Q3) quantiles. For all sdr-based methods (without in-context prior Fig. \ref{fig:results.method_overview}, with in-context prior Fig. \ref{fig:results.render_prior}), the reported error represents an upper bound, as described in \ref{sec:results.marker_free_surgical_robot_localisation_in_preclinical_workflow}. Segmented via the MIT-B5-based three / four channel models (without / with in-context prior, table \ref{tab:results.iou}) unless indicated as SAM 2. Dashed lines represent undraped measurements and solid lines draped measurements. Benchmark data are shown in figure \ref{fig:results.sie_benchmark_data}. For twelve robot configurations, (SAM 2) achieved a median error of $0.9\,/\,19\,\text{mm}$ (undraped / draped), $0.97\,/\,1.33\,\text{mm}$ (confidently within $2\,\text{mm}$), and marker-based (known tool) $0.48\,\text{mm}$. For repeatability errors on par multi-robot in vivo test dataset, refer to Fig. \ref{fig:results.localisation_errors_clinical}.
  • Figure 5: Visually observed and kinematically tracked motions during drilling at vertebra T15 (left). Demonstrates \ref{sec:results.breathing_compensated_drilling_in_robotic_porcine_in_vivo_spine_surgery} during the zh in vivo study. Transitions from breathing-compensated drilling (pre- and post-contact) to drill stop to drill retraction. Distal AprilTag affixed onto the fiducial distal infrared marker (Fig. \ref{fig:results.overview}B) and referenced via \ref{sec:results.proprioceptive_breathing_motion_estimation_in_the_robot_reference_frame}. Quantitative measures are provided in table \ref{tab:results.breathing_amplitudes} (isolated breathing amplitudes) and table \ref{tab:results.breathing_dynamics} (dynamic spine and tissue displacements / relaxations).
  • ...and 4 more figures
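
The Figure 2 caption describes localisation as maximising the overlay between a robot segmentation and the projection of a virtual robot model. The following is a minimal toy sketch of that idea only, not the authors' implementation: it assumes a single pinhole camera rather than a stereo pair, a translation-only pose, a point-based "robot model", and an exhaustive grid search swapped in for whatever optimiser the paper uses. All names (`project`, `iou`, `localise`) and all numeric values are hypothetical.

```python
from itertools import product


def project(points, offset, focal=100.0, centre=(32, 32)):
    """Pinhole-project 3D points, shifted by `offset`, to a set of pixels."""
    pixels = set()
    for x, y, z in points:
        x, y, z = x + offset[0], y + offset[1], z + offset[2]
        if z <= 0:
            continue  # point lies behind the camera
        u = round(focal * x / z + centre[0])
        v = round(focal * y / z + centre[1])
        pixels.add((u, v))
    return pixels


def iou(a, b):
    """Intersection over union of two pixel sets (the overlay score)."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def localise(model_points, observed_mask, candidates):
    """Return the candidate offset whose projection best overlays the mask."""
    return max(candidates,
               key=lambda t: iou(project(model_points, t), observed_mask))


# Hypothetical toy "robot": a flat 5 x 5 grid of surface points (metres).
model = [(0.01 * i, 0.01 * j, 0.0)
         for i in range(-2, 3) for j in range(-2, 3)]

# Stand-in for a segmentation mask: the model seen at an assumed true offset.
true_offset = (0.05, -0.03, 1.0)
mask = project(model, true_offset)

# Assumed search space: 1 cm grid in x and y at a fixed depth of 1 m.
steps = [round(-0.10 + 0.01 * k, 2) for k in range(21)]
candidates = [(tx, ty, 1.0) for tx, ty in product(steps, steps)]

estimate = localise(model, mask, candidates)
print("estimated offset:", estimate)
```

In this toy setting the mask is generated from the model itself, so the search can recover the offset exactly; with a real segmentation, the overlay score would be noisy and a full 6-DoF pose would have to be estimated.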