dARt Vinci: Egocentric Data Collection for Surgical Robot Learning at Scale

Yihao Liu; Yu-Chun Ku; Jiaming Zhang; Hao Ding; Peter Kazanzides; Mehran Armand

TL;DR

The paper tackles data scarcity in robot-assisted minimally invasive surgery (RMIS) by introducing dARt Vinci, an egocentric AR data collection platform that uses a high-fidelity simulator to collect teleoperation demonstrations without a physical robot. It integrates AR hand tracking with a neural-inference-ready pipeline, mapping hand gestures to da Vinci Patient Side Manipulator (PSM) commands and recording compact JSON state data that can be replayed in IsaacSim. Ten primitive RMIS tasks are used to benchmark data collection efficiency. Compared with a real robot setup, a user study shows 41% higher data throughput on average, 10% shorter experiment times, recorded data that is over 400 times smaller, and double the sampling frequency. The work supports scalable data collection for imitation and reinforcement learning in surgical robotics, lowering hardware barriers and allowing broader participation.
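To make the idea of compact JSON state recording concrete, here is a minimal sketch of how per-sample state could be logged and read back for replay. It is not the paper's actual format: the `StateFrame` class, its field names (`t`, `psm_joints`, `jaw`), the JSON Lines layout, and the sample values are all assumptions made for illustration.

```python
import json
from dataclasses import dataclass, asdict
from typing import List


@dataclass
class StateFrame:
    """One recorded sample; every field name here is hypothetical."""
    t: float                 # timestamp in seconds
    psm_joints: List[float]  # commanded joint values for one PSM arm
    jaw: float               # instrument jaw opening in radians


def dump_frames(frames: List[StateFrame]) -> str:
    """Serialize frames as compact JSON Lines (one object per line)."""
    return "\n".join(
        json.dumps(asdict(f), separators=(",", ":")) for f in frames
    )


def load_frames(text: str) -> List[StateFrame]:
    """Parse JSON Lines back into frames, e.g. before replaying them."""
    return [StateFrame(**json.loads(line)) for line in text.splitlines() if line]


# Two made-up samples standing in for a short demonstration.
frames = [
    StateFrame(0.00, [0.0, 0.1, 0.12, 0.0, 0.0, 0.0], 0.30),
    StateFrame(0.01, [0.0, 0.1, 0.13, 0.0, 0.0, 0.0], 0.25),
]
log = dump_frames(frames)
restored = load_frames(log)
```

Storing only state values like these, rather than video or other raw sensor streams, is one plausible reason a log can stay small while still being replayable in a simulator.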

Abstract

Data scarcity has long been an issue in the robot learning community. In safety-critical domains such as surgical applications, obtaining high-quality data can be especially difficult. This poses challenges to researchers seeking to exploit recent advancements in reinforcement learning and imitation learning, which have greatly improved generalizability and enabled robots to conduct tasks autonomously. We introduce dARt Vinci, a scalable data collection platform for robot learning in surgical settings. The system uses Augmented Reality (AR) hand tracking and a high-fidelity physics engine to capture subtle maneuvers in primitive surgical tasks. By eliminating the need for a physical robot setup and providing flexibility in terms of time, space, and hardware resources (such as multiview sensors and actuators), specialized simulation is a viable alternative. At the same time, AR allows the robot data collection to be more egocentric, supported by its body tracking and content overlaying capabilities. Our user study, which uses primitive tasks widely used for training teleoperation with da Vinci surgical robots, confirms the proposed system's efficiency and usability. Data throughput improves across all tasks compared to real robot settings, by 41% on average. The total experiment time is reduced by an average of 10%. The temporal demand in the task load survey is improved. These gains are statistically significant. Additionally, the collected data is over 400 times smaller in size, requiring far less storage while achieving double the frequency.
