TeleGate: Whole-Body Humanoid Teleoperation via Gated Expert Selection with Motion Prior

Jie Li, Bing Tang, Feng Wu, Rongyun Cao

TL;DR

TeleGate tackles real-time whole-body humanoid teleoperation by preserving the full capabilities of multiple domain experts through gated expert selection, avoiding the degradation seen in distillation-based fusion. It adds a VAE-based motion prediction prior to compensate for the lack of future references, enabling anticipatory control for dynamic motions. Trained on only 2.5 hours of inertial motion capture data and validated in simulation and on a Unitree G1, TeleGate achieves higher tracking accuracy and success rates than the baseline methods across diverse motions including running, jumping, and fall recovery. The system transfers from simulation to the physical robot, offering a scalable approach for agile, high-precision teleoperation in unstructured environments.

Abstract

Real-time whole-body teleoperation is a critical method for humanoid robots to perform complex tasks in unstructured environments. However, developing a unified controller that robustly supports diverse human motions remains a significant challenge. Existing methods typically distill multiple expert policies into a single general policy, which often leads to performance degradation, particularly on highly dynamic motions. This paper presents TeleGate, a unified whole-body teleoperation framework for humanoid robots that achieves high-precision tracking across various motions while avoiding the performance loss inherent in knowledge distillation. Our key idea is to preserve the full capability of domain-specific expert policies by training a lightweight gating network, which dynamically activates experts in real time based on proprioceptive states and reference trajectories. Furthermore, to compensate for the absence of future reference trajectories in real-time teleoperation, we introduce a VAE-based motion prior module that extracts implicit future motion intent from historical observations, enabling anticipatory control for motions that require prediction, such as jumping and standing up. We evaluate TeleGate in simulation and deploy it on the Unitree G1 humanoid robot. Using only 2.5 hours of motion capture data for training, TeleGate achieves high-precision real-time teleoperation across diverse dynamic motions (e.g., running, fall recovery, and jumping), significantly outperforming the baseline methods in both tracking accuracy and success rate.
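
To make the gating idea concrete, the following is a minimal Python sketch and not the authors' implementation. It assumes hard (argmax) selection of a single expert per control step, which the abstract does not specify. The linear gate, the feature layout, and the names select_expert and gated_action are hypothetical stand-ins for the paper's lightweight gating network and expert policies.

```python
import numpy as np


def select_expert(gate_logits: np.ndarray) -> int:
    """Hard selection: return the index of the highest-scoring expert."""
    return int(np.argmax(gate_logits))


def gated_action(proprio, reference, gate_weights, experts):
    """Score the experts with the gate, then run only the selected one.

    proprio, reference: 1-D feature vectors (hypothetical layout).
    gate_weights: (num_experts, len(proprio) + len(reference)) linear gate,
        an assumed stand-in for the lightweight gating network.
    experts: list of callables mapping the concatenated observation to an action.
    """
    obs = np.concatenate([proprio, reference])
    logits = gate_weights @ obs
    idx = select_expert(logits)
    # The chosen expert runs unmodified, so its behavior is not blended
    # with or distilled into any other policy.
    return idx, experts[idx](obs)


# Toy example with two placeholder "experts".
experts = [lambda obs: 0.1 * obs[:2], lambda obs: -obs[:2]]
gate_weights = np.array([[1.0, 0.0, 0.0, 0.0],
                         [0.0, 0.0, 1.0, 0.0]])
proprio = np.array([0.2, 0.0])
reference = np.array([0.9, 0.0])
idx, action = gated_action(proprio, reference, gate_weights, experts)
```

Because only the gate is trained on top of fixed experts in this sketch, each expert's output is passed through unchanged, which is the property the abstract contrasts with distillation into a single policy.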

Paper Structure (43 sections, 14 equations, 5 figures, 7 tables)

Figures (5)

  • Figure 1: Whole-body teleoperation of the Unitree G1 humanoid robot using inertial motion capture equipment: (a) Grasping and placing toys into a basket; (b) Standing long jump; (c) Prone position stand-up; (d) Kicking a ball.
  • Figure 2: Framework overview. Our method consists of three stages: (I) Data collection and preprocessing using inertial motion capture; (II) Expert policy training with VAE-based motion prediction prior; (III) Gating network training for dynamic expert selection during inference.
  • Figure 3: Expert switching analysis during continuous motion. Top: Key frame sequence; Bottom: Target joint position curves for several joints, with different background colors representing different experts.
  • Figure 4: Trajectory reconstruction performance of VAE motion prediction prior. Left: Historical reference trajectory sequence; Right: Ground truth future trajectory (blue) and VAE-predicted future trajectory (green). From top to bottom: stand-up, lie-down, running, and walking.
  • Figure 5: More real-world teleoperation skills: (a) sitting; (b) walking; (c) running; (d) lying prone; (e) lateral slide; (f) single-leg hop.
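
The motion prediction prior described in the Figure 4 caption (a history window in, a predicted future trajectory out) can be sketched as a generic VAE forward pass. This is a hedged illustration under stated assumptions: the linear encoder and decoder, the dimensions, and the function names are hypothetical, and the paper's actual architecture, latent size, and loss weighting are not given in this section.

```python
import numpy as np


def encode(history, w_mu, w_logvar):
    """Map a flattened history window to the mean and log-variance of a latent."""
    return w_mu @ history, w_logvar @ history


def reparameterize(mu, logvar, rng):
    """Sample z = mu + sigma * eps with eps ~ N(0, I)."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * logvar) * eps


def decode(z, w_dec):
    """Map the latent to a predicted future trajectory window."""
    return w_dec @ z


def kl_to_standard_normal(mu, logvar):
    """KL(N(mu, sigma^2) || N(0, I)), summed over latent dimensions."""
    return 0.5 * float(np.sum(np.exp(logvar) + mu**2 - 1.0 - logvar))


rng = np.random.default_rng(0)
hist_dim, latent_dim, future_dim = 12, 4, 6  # hypothetical sizes
w_mu = 0.1 * rng.standard_normal((latent_dim, hist_dim))
w_logvar = 0.1 * rng.standard_normal((latent_dim, hist_dim))
w_dec = 0.1 * rng.standard_normal((future_dim, latent_dim))

history = rng.standard_normal(hist_dim)      # stand-in for past references
mu, logvar = encode(history, w_mu, w_logvar)
z = reparameterize(mu, logvar, rng)          # latent "motion intent"
predicted_future = decode(z, w_dec)          # compared to ground truth in training
kl = kl_to_standard_normal(mu, logvar)       # regularizer in the usual VAE loss
```

In a setup like this, the latent z (rather than the decoded trajectory) would typically be what a downstream policy consumes at run time, since no future reference is available during teleoperation; whether TeleGate does exactly this is an assumption here.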