Holo-Dex: Teaching Dexterity with Immersive Mixed Reality

Sridhar Pandian Arunachalam, Irmak Güzey, Soumith Chintala, Lerrel Pinto

TL;DR

Holo-Dex addresses the data collection bottleneck in dexterous manipulation by letting teachers teleoperate robots in immersive mixed reality and by learning from a small number of demonstrations. It combines BYOL-based self-supervised visual embeddings with non-parametric nearest-neighbor imitation to train dexterous policies that generalize to unseen objects. Across six tasks, demonstrations average 60 s each, and the learned policies achieve high success rates on most tasks, with VINN outperforming the baselines and generalizing zero-shot to new objects. The work demonstrates a practical, scalable path to rapid, generalizable dexterous skill acquisition and provides open-source resources for MR-based robot teaching.

Abstract

A fundamental challenge in teaching robots is to provide an effective interface for human teachers to demonstrate useful skills to a robot. This challenge is exacerbated in dexterous manipulation, where teaching high-dimensional, contact-rich behaviors often requires esoteric teleoperation tools. In this work, we present Holo-Dex, a framework for dexterous manipulation that places a teacher in an immersive mixed reality through commodity VR headsets. The high-fidelity hand pose estimator onboard the headset is used to teleoperate the robot and collect demonstrations for a variety of general-purpose dexterous tasks. Given these demonstrations, we use powerful feature learning combined with non-parametric imitation to train dexterous skills. Our experiments on six common dexterous tasks, including in-hand rotation, spinning, and bottle opening, indicate that Holo-Dex can both collect high-quality demonstration data and train skills in a matter of hours. Finally, we find that our trained skills can exhibit generalization on objects not seen in training. Videos of Holo-Dex are available at https://holo-dex.github.io.
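
The non-parametric nearest-neighbor imitation mentioned in the TL;DR can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes that image embeddings have already been produced by some encoder (for example, one trained with BYOL, which is not shown here), and the function name `knn_action`, the Euclidean distance, and the softmax-over-negative-distance weighting are illustrative assumptions. Given the embedding of the current observation, the sketch looks up the `k` closest demonstration embeddings and returns a weighted average of their recorded actions.

```python
import math
from typing import List, Sequence


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_action(
    query: Sequence[float],
    demo_embeddings: Sequence[Sequence[float]],
    demo_actions: Sequence[Sequence[float]],
    k: int = 3,
) -> List[float]:
    """Hypothetical helper: predict an action from the k nearest demonstrations.

    Assumption: neighbors are weighted by a softmax over negative distances,
    so closer demonstrations contribute more to the predicted action.
    """
    if not demo_embeddings or len(demo_embeddings) != len(demo_actions):
        raise ValueError("need one action per demonstration embedding")
    k = min(k, len(demo_embeddings))

    # Distance from the query embedding to every demonstration embedding.
    dists = [euclidean(query, e) for e in demo_embeddings]
    nearest = sorted(range(len(dists)), key=dists.__getitem__)[:k]

    # Softmax over negative distances; subtracting the smallest distance
    # keeps the exponentials numerically stable.
    d_min = dists[nearest[0]]
    weights = [math.exp(-(dists[i] - d_min)) for i in nearest]
    total = sum(weights)

    # Weighted average of the neighbors' actions, one dimension at a time.
    dim = len(demo_actions[0])
    return [
        sum(w * demo_actions[i][j] for w, i in zip(weights, nearest)) / total
        for j in range(dim)
    ]


if __name__ == "__main__":
    # Toy data: three 2-D "embeddings", each paired with a 1-D "action".
    embeddings = [[0.0, 0.0], [2.0, 0.0], [10.0, 10.0]]
    actions = [[0.0], [1.0], [5.0]]
    print(knn_action([1.0, 0.0], embeddings, actions, k=2))
```

Because there is no parametric policy to fit, adding a demonstration only means appending its embeddings and actions to the lookup set, which is one reason this style of imitation suits small demonstration datasets.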

Paper Structure

This paper contains 22 sections, 6 figures, and 2 tables.

Figures (6)

  • Figure 1: We present Holo-Dex, a framework that (a) collects high-quality demonstration data by placing human teachers in an immersive mixed reality world, and then (b) learns visual policies from a handful of these demonstrations to solve dexterous manipulation tasks.
  • Figure 2: Overview of Holo-Dex's teleoperation module. Given a hand pose in the VR interface, the controller streams the keypoint data to the robot's server, which transforms and retargets the human hand keypoints to the Allegro Hand. A view of the teleoperated hand is then streamed back to the VR headset as real-time visual feedback.
  • Figure 3: Demonstration collection process for three of our tasks. For each task, the first row shows the user's perspective inside the VR headset and the second row shows the corresponding robot hand configuration.
  • Figure 4: Successful rollouts of visual policies trained through Holo-Dex on our six dexterous tasks.
  • Figure 5: On the left, we depict the object present in demonstration data. On the right, we depict the rollouts produced by running our policies on objects that were not present in demonstration collection. Green boxes denote a successful rollout, while red boxes denote a failure. We see that policies learned by Holo-Dex are fairly robust to visually diverse novel objects without object-specific training.
  • ...and 1 more figure