Bunny-VisionPro: Real-Time Bimanual Dexterous Teleoperation for Imitation Learning

Runyu Ding, Yuzhe Qin, Jiyue Zhu, Chengzhe Jia, Shiqi Yang, Ruihan Yang, Xiaojuan Qi, Xiaolong Wang

TL;DR

Problem: collecting high-quality demonstrations for dexterous, bimanual manipulation is hard due to control complexity and safety concerns. Approach: Bunny-VisionPro integrates Vision Pro-based tracking with three modules (hand retargeting, arm control with collision/singularity avoidance, and low-cost haptic feedback) to enable real-time teleoperation. Findings: on the Telekinesis benchmark, it achieves higher success rates and faster task completion; its demonstrations improve downstream imitation learning, including better generalization and success on long-horizon tasks; haptic feedback to the operator improves usability. Impact: the system advances dexterous manipulation and imitation learning by enabling safe, intuitive, bimanual teleoperation with affordable hardware.

Abstract

Teleoperation is a crucial tool for collecting human demonstrations, but controlling robots with bimanual dexterous hands remains a challenge. Existing teleoperation systems struggle to handle the complexity of coordinating two hands for intricate manipulations. We introduce Bunny-VisionPro, a real-time bimanual dexterous teleoperation system that leverages a VR headset. Unlike previous vision-based teleoperation systems, we design novel low-cost devices to provide haptic feedback to the operator, enhancing immersion. Our system prioritizes safety by incorporating collision and singularity avoidance while maintaining real-time performance through innovative designs. Bunny-VisionPro outperforms prior systems on a standard task suite, achieving higher success rates and reduced task completion times. Moreover, the high-quality teleoperation demonstrations improve downstream imitation learning performance, leading to better generalizability. Notably, Bunny-VisionPro enables imitation learning with challenging multi-stage, long-horizon dexterous manipulation tasks, which have rarely been addressed in previous work. Our system's ability to handle bimanual manipulations while prioritizing safety and real-time performance makes it a powerful tool for advancing dexterous manipulation and imitation learning.

Paper Structure (21 sections, 6 equations, 8 figures, 6 tables, 2 algorithms)

Figures (8)

  • Figure 1: System Overview and Task Suites. (a) Hand poses captured by Apple Vision Pro are converted into robot motion control commands for real-time teleoperation. The robot provides sensory feedback, including vision and touch, to operators via Vision Pro and actuator-equipped finger cots. (b) We design diverse short-horizon (left column) and long-horizon (right column) tasks to evaluate teleoperation performance and its application to imitation learning.
  • Figure 2: Teleoperation System. The operator controls the robot hand and arm using finger and wrist poses, respectively. The system's visual and haptic feedback, combined with its four real-time capabilities, provides an intuitive and immersive VR experience for the operator.
  • Figure 3: Human Haptics Feedback Device. Tactile readings from FSR sensors located in the robot's fingertips undergo calibration and low-pass filtering. The processed tactile signals then drive ERM motors to deliver touch feedback to the human operator.
  • Figure 4: User Study. The study evaluates the impact of haptic feedback on success rates and time efficiency.
  • Figure 5: Sphere Modeling of Robot Arm Links for efficient and differentiable collision checking and avoidance.
  • ...and 3 more figures
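
The haptic pipeline described in the Figure 3 caption (FSR reading, calibration, low-pass filtering, ERM motor drive) can be illustrated with a minimal sketch. This is not the authors' implementation: the linear two-point calibration, the first-order exponential filter, the deadband, and the 8-bit PWM range are all assumptions chosen for illustration, and the names `HapticChannel` and `to_pwm_duty` are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class HapticChannel:
    """One fingertip: raw FSR reading -> calibrated, smoothed contact level."""

    baseline: float      # assumed: raw FSR reading with no contact
    full_scale: float    # assumed: raw FSR reading at the largest expected force
    alpha: float = 0.2   # assumed smoothing factor, 0 < alpha <= 1
    _state: float = field(default=0.0, init=False, repr=False)

    def calibrate(self, raw: float) -> float:
        """Map a raw reading linearly onto [0, 1] and clamp out-of-range values."""
        level = (raw - self.baseline) / (self.full_scale - self.baseline)
        return min(1.0, max(0.0, level))

    def step(self, raw: float) -> float:
        """Calibrate one sample and pass it through a first-order low-pass filter."""
        level = self.calibrate(raw)
        self._state += self.alpha * (level - self._state)
        return self._state


def to_pwm_duty(level: float, deadband: float = 0.05, max_duty: int = 255) -> int:
    """Convert a contact level in [0, 1] to a hypothetical 8-bit motor duty cycle.

    Levels below the deadband produce no vibration, so sensor noise at rest
    does not buzz the operator's fingertip.
    """
    if level < deadband:
        return 0
    return round(level * max_duty)


if __name__ == "__main__":
    channel = HapticChannel(baseline=100.0, full_scale=900.0, alpha=0.5)
    for raw in (100.0, 900.0, 900.0):
        level = channel.step(raw)
        print(f"raw={raw:6.1f} level={level:.2f} duty={to_pwm_duty(level)}")
```

A smaller `alpha` suppresses more sensor noise at the cost of added latency between robot contact and vibration, which is the usual trade-off when filtering a signal that feeds real-time feedback.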