MirrorLimb: Implementing hand pose acquisition and robot teleoperation based on RealMirror

Cong Tai, Hansheng Wu, Haixu Long, Zhengbin Long, Zhaoyu Zheng, Haodong Xiang, Tao Shen

TL;DR

This work introduces MirrorLimb, a PICO-based framework for low-cost, real-time hand pose acquisition and robot teleoperation, integrated with the RealMirror ecosystem to support Vision-Language-Action (VLA) research. It uses a dual-channel data pipeline: a WebXR-based handle (controller) stream and an OpenXR/Unity-based hand-gesture stream, both normalized to IsaacSim coordinates to enable stable trajectories and rapid dataset generation. A kinematic post-processing stage sets an executable-frame flag by checking the displacement $d_t=\|\Delta \mathbf{p}_t\|$ and the angle changes $\Delta \theta_{t,i}$ and $\Delta \phi_{t,i}$ against the thresholds $\delta_1$, $\delta_2$, $\epsilon_1$, and $\epsilon_2$, which suppresses jitter and jumps and improves teleoperation reliability. The system provides standardized gesture and handle data schemas for integration with RealMirror's VLA workflow and runs on cost-efficient XR devices such as PICO, making dexterous manipulation studies and VLA data collection across diverse end-effectors more accessible. Overall, MirrorLimb aims to lower hardware barriers and accelerate the development of embodied AI and VLA datasets within IsaacSim and RealMirror.
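The summary names the quantities used by the post-processing stage but not how they are combined, so the following is only a minimal sketch under stated assumptions, not the authors' method. It assumes that $\delta_1$ is a lower deadband on $d_t$ (smaller motion is treated as jitter and the previous position is held), that $\delta_2$ is an upper limit on $d_t$ (larger motion is treated as a tracking jump), and that $\epsilon_1$ and $\epsilon_2$ bound the per-joint changes $\Delta \theta_{t,i}$ and $\Delta \phi_{t,i}$. The names `Thresholds` and `gate_frame`, and the numeric values in the usage note, are hypothetical.

```python
import math
from dataclasses import dataclass
from typing import Sequence, Tuple

Vec3 = Tuple[float, float, float]


@dataclass(frozen=True)
class Thresholds:
    """Hypothetical container; the roles of the four thresholds are assumed."""
    delta_1: float  # assumed jitter deadband on d_t (metres)
    delta_2: float  # assumed jump limit on d_t (metres)
    eps_1: float    # assumed limit on |delta theta_{t,i}| (radians)
    eps_2: float    # assumed limit on |delta phi_{t,i}| (radians)


def gate_frame(
    prev_pos: Sequence[float],
    cur_pos: Sequence[float],
    d_theta: Sequence[float],
    d_phi: Sequence[float],
    th: Thresholds,
) -> Tuple[bool, Vec3]:
    """Return (executable, position_to_use) for one tracking frame."""
    # d_t = ||delta p_t||: Euclidean distance between consecutive positions.
    d_t = math.dist(prev_pos, cur_pos)
    max_theta = max((abs(x) for x in d_theta), default=0.0)
    max_phi = max((abs(x) for x in d_phi), default=0.0)

    # Assumed jump rule: any quantity above its upper bound rejects the frame.
    if d_t > th.delta_2 or max_theta > th.eps_1 or max_phi > th.eps_2:
        return False, tuple(prev_pos)

    # Assumed jitter rule: tiny displacements keep the previous position.
    if d_t < th.delta_1:
        return True, tuple(prev_pos)

    return True, tuple(cur_pos)
```

For example, with `Thresholds(delta_1=0.002, delta_2=0.10, eps_1=0.5, eps_2=0.5)`, a 1 mm displacement is held at the previous position, a 5 cm displacement passes through, and a 50 cm displacement between consecutive frames is flagged as non-executable. Returning the previous position on rejection is one possible design choice; a real pipeline might instead drop the frame or interpolate.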

Abstract

In this work, we present a PICO-based robot teleoperation framework that acquires hand motion and pose data in real time at low cost and is more cost-effective than mainstream visual tracking and motion capture solutions. The framework is natively compatible with the RealMirror ecosystem and offers ready-to-use functionality for stable, precise recording of robot trajectories in the Isaac simulation environment, which facilitates the construction of Vision-Language-Action (VLA) datasets. The system also supports real-time teleoperation of robots with a variety of end-effectors, including dexterous hands and robotic grippers. This work aims to lower the technical barriers to studying upper-limb robotic manipulation and thereby accelerate VLA-related research.

Paper Structure

This paper contains 11 sections, 2 equations, 3 figures, 2 tables.

Figures (3)

  • Figure 1: Handle and gesture teleoperation demonstration.
  • Figure 2: The mapping and coordinate system representation of hand tracking.
  • Figure 3: Robot teleoperation system framework based on RealMirror ecosystem.