Bunny-VisionPro: Real-Time Bimanual Dexterous Teleoperation for Imitation Learning
Runyu Ding, Yuzhe Qin, Jiyue Zhu, Chengzhe Jia, Shiqi Yang, Ruihan Yang, Xiaojuan Qi, Xiaolong Wang
TL;DR
Problem: collecting high-quality demonstrations for dexterous, bimanual manipulation is hard due to control complexity and safety concerns. Approach: Bunny-VisionPro integrates Vision Pro-based tracking with three modules (hand retargeting, arm control with collision/singularity avoidance, and low-cost haptic feedback) to enable real-time teleoperation. Findings: on the Telekinesis benchmark, it achieves higher success rates and shorter task completion times than prior systems; its demonstrations improve downstream imitation learning, including better generalization and success on long-horizon tasks; haptic feedback improves usability for the operator. Impact: the system advances dexterous manipulation and imitation learning by enabling safe, intuitive, bimanual teleoperation with affordable hardware.
Abstract
Teleoperation is a crucial tool for collecting human demonstrations, but controlling robots with bimanual dexterous hands remains a challenge. Existing teleoperation systems struggle to handle the complexity of coordinating two hands for intricate manipulations. We introduce Bunny-VisionPro, a real-time bimanual dexterous teleoperation system that leverages a VR headset. Unlike previous vision-based teleoperation systems, we design novel low-cost devices to provide haptic feedback to the operator, enhancing immersion. Our system prioritizes safety by incorporating collision and singularity avoidance while maintaining real-time performance through innovative designs. Bunny-VisionPro outperforms prior systems on a standard task suite, achieving higher success rates and reduced task completion times. Moreover, the high-quality teleoperation demonstrations improve downstream imitation learning performance, leading to better generalizability. Notably, Bunny-VisionPro enables imitation learning with challenging multi-stage, long-horizon dexterous manipulation tasks, which have rarely been addressed in previous work. Our system's ability to handle bimanual manipulations while prioritizing safety and real-time performance makes it a powerful tool for advancing dexterous manipulation and imitation learning.
