Differentiable Rendering-based Pose Estimation for Surgical Robotic Instruments
Zekai Liang, Zih-Yun Chiu, Florian Richter, Michael C. Yip
TL;DR
This work tackles robust, markerless initialization of the camera-to-robot transform for cable-driven surgical robots where joint-angle readings are unreliable. It introduces a differentiable rendering framework that models the instrument with cylinders and operates in a four-DoF pose hypothesis space using a LookAt-based sampling strategy to perform a one-shot calibration. A composite objective, combining a silhouette rendering loss $\mathcal{L}_{render}$ and a geometry loss $\mathcal{L}_{geo}$, guides gradient-based optimization to rapidly converge to accurate pose estimates, even from partial visual information. Real-world experiments on the dVRK show superior calibration consistency over PnP baselines and effective open-loop manipulation, highlighting the approach’s practicality for improving tool tracking in robot-assisted surgery.
Abstract
Robot pose estimation is a challenging and crucial task for vision-based surgical robotic automation. Typical robotic calibration approaches, however, are not applicable to surgical robots such as the da Vinci Research Kit (dVRK), because cable drives introduce joint-angle measurement errors and the kinematic chain is only partially visible. Hence, previous works in surgical robotic automation used tracking algorithms to estimate the pose of the surgical tool in real time and compensate for the joint-angle errors. However, a major limitation of these tracking works is the initialization step, which relied only on keypoints and SolvePnP. In this work, we fully explore the potential of geometric primitives beyond keypoints, namely cylinders, with differentiable rendering, and construct a versatile pose-matching pipeline in a novel pose hypothesis space. We demonstrate the state-of-the-art performance of our single-shot calibration method in terms of both calibration consistency and real surgical tasks. As a result, this markerless calibration approach proves to be a robust and generalizable initialization step for surgical tool tracking.
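
To make the idea of a low-dimensional, LookAt-based pose hypothesis space more concrete, the following is a minimal sketch and not the authors' implementation. It assumes the four degrees of freedom are azimuth, elevation, and distance of the camera relative to a target point, plus roll about the viewing axis; this parameterization, the function names `look_at_pose` and `sample_hypotheses`, the OpenCV-style camera axes, and the distance range are all illustrative assumptions that do not come from the paper.

```python
import numpy as np

def look_at_pose(azimuth, elevation, distance, roll, target=np.zeros(3)):
    """Build a 4x4 camera-to-world transform for a camera looking at `target`.

    Assumed (hypothetical) 4-DoF parameterization: the camera sits on a sphere
    around the target (azimuth, elevation, distance) and may roll about its
    optical axis. Camera axes follow the OpenCV convention:
    x right, y down, z forward.
    """
    # Camera position on a sphere centred on the target.
    position = target + distance * np.array([
        np.cos(elevation) * np.cos(azimuth),
        np.cos(elevation) * np.sin(azimuth),
        np.sin(elevation),
    ])
    # Optical axis points from the camera towards the target.
    forward = target - position
    forward = forward / np.linalg.norm(forward)
    # World up vector; switch to another axis if it is nearly parallel to forward.
    up = np.array([0.0, 0.0, 1.0])
    if abs(forward @ up) > 0.999:
        up = np.array([0.0, 1.0, 0.0])
    right = np.cross(forward, up)
    right = right / np.linalg.norm(right)
    down = np.cross(forward, right)
    # Columns are the camera axes expressed in the world frame.
    rotation = np.stack([right, down, forward], axis=1)
    # Roll is a rotation about the camera's own z (forward) axis.
    c, s = np.cos(roll), np.sin(roll)
    roll_matrix = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
    pose = np.eye(4)
    pose[:3, :3] = rotation @ roll_matrix
    pose[:3, 3] = position
    return pose

def sample_hypotheses(n, rng, distance_range=(0.05, 0.30)):
    """Draw n random pose hypotheses; the distance range (metres) is a guess."""
    azimuth = rng.uniform(-np.pi, np.pi, n)
    elevation = rng.uniform(-np.pi / 2, np.pi / 2, n)
    distance = rng.uniform(distance_range[0], distance_range[1], n)
    roll = rng.uniform(-np.pi, np.pi, n)
    return np.stack([
        look_at_pose(a, e, d, r)
        for a, e, d, r in zip(azimuth, elevation, distance, roll)
    ])
```

Every hypothesis produced this way keeps the target on the optical axis in front of the camera, so a downstream optimizer would start from views in which the instrument is at least roughly in frame. In a full pipeline, each hypothesis would presumably be refined by gradient descent on the composite objective built from $\mathcal{L}_{render}$ and $\mathcal{L}_{geo}$; that refinement step is not shown here.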
