Bunny-VisionPro: Real-Time Bimanual Dexterous Teleoperation for Imitation Learning

Runyu Ding; Yuzhe Qin; Jiyue Zhu; Chengzhe Jia; Shiqi Yang; Ruihan Yang; Xiaojuan Qi; Xiaolong Wang

TL;DR

Problem: collecting high-quality demonstrations for dexterous, bimanual manipulation is hard because of control complexity and safety concerns. Approach: Bunny-VisionPro combines Apple Vision Pro hand tracking with three modules (hand motion retargeting, arm motion control with collision and singularity avoidance, and low-cost haptic feedback) to enable real-time teleoperation. Findings: on the Telekinesis benchmark, it achieves higher success rates and shorter task completion times than prior systems; its demonstrations improve downstream imitation learning, including better generalization and success on long-horizon tasks; and haptic feedback improves usability. Impact: the system advances dexterous manipulation and imitation learning by enabling safe, intuitive bimanual teleoperation with affordable hardware.

Abstract

Teleoperation is a crucial tool for collecting human demonstrations, but controlling robots with bimanual dexterous hands remains a challenge. Existing teleoperation systems struggle to handle the complexity of coordinating two hands for intricate manipulations. We introduce Bunny-VisionPro, a real-time bimanual dexterous teleoperation system that leverages a VR headset. Unlike previous vision-based teleoperation systems, we design novel low-cost devices to provide haptic feedback to the operator, enhancing immersion. Our system prioritizes safety by incorporating collision and singularity avoidance while maintaining real-time performance through innovative designs. Bunny-VisionPro outperforms prior systems on a standard task suite, achieving higher success rates and reduced task completion times. Moreover, the high-quality teleoperation demonstrations improve downstream imitation learning performance, leading to better generalizability. Notably, Bunny-VisionPro enables imitation learning with challenging multi-stage, long-horizon dexterous manipulation tasks, which have rarely been addressed in previous work. Our system's ability to handle bimanual manipulations while prioritizing safety and real-time performance makes it a powerful tool for advancing dexterous manipulation and imitation learning.

Paper Structure (21 sections, 6 equations, 8 figures, 6 tables, 2 algorithms)

Introduction
Related Work
Teleoperation System
Overview
Robot Hand Motion Retargeting
Robot Arm Motion Control
Human Haptic Feedback Device
System Evaluation
Profiling Results
Real Robot Teleoperation Experiments
User Study of Haptic Feedback
Imitation Learning
Main Results
Tactile Data as Policy Input
Conclusion and Limitation
...and 6 more sections

Figures (8)

Figure 1: System Overview and Task Suites. (a) Hand poses captured by Apple Vision Pro are converted into robot motion control commands for real-time teleoperation. The robot provides sensory feedback, including vision and touch, to operators via Vision Pro and actuator-equipped finger cots. (b) We design diverse short-horizon (left column) and long-horizon (right column) tasks to evaluate teleoperation performance and its application to imitation learning.
Figure 2: Teleoperation System. The operator controls the robot hand and arm using finger and wrist poses, respectively. The system's visual and haptic feedback, combined with its four real-time capabilities, provides an intuitive and immersive VR experience for the operator.
Figure 3: Human Haptic Feedback Device. Tactile readings from FSR sensors located in the robot's fingertips undergo calibration and low-pass filtering. The processed tactile signals then drive ERM motors to deliver touch feedback to the human operator.
Figure 4: User Study. The study evaluates the impact of haptic feedback on success rates and time efficiency.
Figure 5: Sphere Modeling of Robot Arm Links for efficient and differentiable collision checking and avoidance.
...and 3 more figures
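
The captions for Figures 3 and 5 each describe a small computation, and a rough sketch may help make them concrete. The code below is an illustration only, not the authors' implementation: the linear calibration, the first-order exponential filter, the deadband, and every name (`calibrate`, `LowPass`, `erm_duty`, `min_sphere_clearance`) are assumptions introduced here. The first part follows the Figure 3 pipeline (FSR reading, calibration, low-pass filtering, ERM motor drive); the second computes the clearance between two sets of spheres in the spirit of Figure 5.

```python
import numpy as np

# --- Figure 3 sketch: FSR reading -> calibration -> low-pass filter -> ERM drive ---

def calibrate(raw, baseline, full_scale):
    """Map a raw FSR reading to [0, 1]. A linear calibration is assumed here."""
    return float(np.clip((raw - baseline) / (full_scale - baseline), 0.0, 1.0))


class LowPass:
    """First-order exponential low-pass filter (the filter type is an assumption)."""

    def __init__(self, alpha):
        self.alpha = alpha  # smoothing factor in (0, 1]; larger reacts faster
        self.state = 0.0

    def step(self, x):
        # Move the filter state a fraction alpha of the way toward the new sample.
        self.state += self.alpha * (x - self.state)
        return self.state


def erm_duty(level, deadband=0.05):
    """Turn a filtered pressure level into a motor duty cycle in [0, 1].

    Below the (hypothetical) deadband the motor stays off, so sensor noise
    does not produce a constant buzz.
    """
    return 0.0 if level < deadband else float(level)


# --- Figure 5 sketch: links approximated by spheres for collision checking ---

def min_sphere_clearance(centers_a, radii_a, centers_b, radii_b):
    """Smallest surface-to-surface distance between two sets of spheres.

    centers_*: (N, 3) arrays, radii_*: (N,) arrays. A negative result means
    at least one pair of spheres overlaps.
    """
    diff = centers_a[:, None, :] - centers_b[None, :, :]  # all pairs, (Na, Nb, 3)
    dist = np.linalg.norm(diff, axis=-1)                   # center distances
    clearance = dist - radii_a[:, None] - radii_b[None, :]
    return float(clearance.min())
```

In a control loop, each fingertip reading would pass through `calibrate`, then `LowPass.step`, then `erm_duty` once per cycle. The clearance function uses only sums, norms, and a minimum, so the same expression could be written in an autodiff framework if gradients are needed for avoidance; that choice is likewise an assumption rather than something the captions specify.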
