CLONE: Closed-Loop Whole-Body Humanoid Teleoperation for Long-Horizon Tasks

Yixuan Li, Yutang Lin, Jieming Cui, Tengyu Liu, Wei Liang, Yixin Zhu, Siyuan Huang

Abstract

Humanoid teleoperation plays a vital role in demonstrating and collecting data for complex humanoid-scene interactions. However, current teleoperation systems face critical limitations: they decouple upper- and lower-body control to maintain stability, restricting natural coordination, and operate open-loop without real-time position feedback, leading to accumulated drift. The fundamental challenge is achieving precise, coordinated whole-body teleoperation over extended durations while maintaining accurate global positioning. Here we show that an MoE-based teleoperation system, CLONE, with closed-loop error correction enables unprecedented whole-body teleoperation fidelity, maintaining minimal positional drift over long-range trajectories using only head and hand tracking from an MR headset. Unlike previous methods that either sacrifice coordination for stability or suffer from unbounded drift, CLONE learns diverse motion skills while preventing tracking error accumulation through real-time feedback, enabling complex coordinated movements such as "picking up objects from the ground." These results establish a new milestone for whole-body humanoid teleoperation for long-horizon humanoid-scene interaction tasks.
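The contrast between open-loop drift and closed-loop correction can be illustrated with a toy model. The sketch below is not the CLONE controller: it is a one-dimensional simulation under assumed values, in which a robot follows a target that advances a fixed distance per step while each executed step carries a small constant bias. The function name `simulate_drift` and the parameters `step_size`, `bias`, and `gain` are hypothetical and chosen only for illustration.

```python
def simulate_drift(steps, step_size=0.01, bias=0.002, gain=0.0):
    """Toy 1-D tracking model (illustrative only, not the paper's method).

    The target advances by `step_size` each step. The robot executes the
    commanded step plus a constant `bias` (a stand-in for systematic
    execution error). With gain == 0 the command ignores the measured
    position error (open loop); with gain > 0 a proportional feedback
    term is added (closed loop).
    Returns the absolute position error after every step.
    """
    target = 0.0
    robot = 0.0
    errors = []
    for _ in range(steps):
        error = target - robot                 # measured position error
        command = step_size + gain * error     # feedforward + feedback
        target += step_size
        robot += command + bias                # biased execution
        errors.append(abs(target - robot))
    return errors


open_loop = simulate_drift(500, gain=0.0)
closed_loop = simulate_drift(500, gain=0.5)
print("open-loop final error:  ", open_loop[-1])
print("closed-loop final error:", closed_loop[-1])
```

In this model the error obeys e_next = (1 - gain) * e - bias, so without feedback it grows linearly with the number of steps, while with feedback it settles near bias / gain. A real system would replace the measured error with a state estimate, which the paper obtains from LiDAR odometry.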

Paper Structure

This paper contains 35 sections, 2 equations, 13 figures, 4 tables.

Figures (13)

  • Figure 1: CLONE employs an MoE-based policy with closed-loop error correction for humanoid teleoperation, enabling precise whole-body coordination and long-horizon task execution.
  • Figure 2: Whole-body humanoid teleoperation from minimal input. Our approach enables intuitive control of a humanoid robot using only head and hand poses from mixed reality input, generating coordinated whole-body motions including natural locomotion. Through closed-loop tracking, the system maintains accurate correspondence between operator and robot over extended operation periods, enabling complex long-horizon tasks that require sustained precision.
  • Figure 3: The CLONE framework. (a) CLONED curates and augments retargeted AMASS [mahmood2019amass] data through motion editing to introduce diverse humanoid motions and detailed hand movements. (b) A teacher policy is trained using privileged information, including full robot state and environmental context. (c) An MoE network serves as the student policy, distilled from the teacher to operate with real-world observations only. (d) For real-world deployment, we integrate LiDAR odometry to obtain real-time humanoid states, enabling closed-loop error correction during teleoperation.
  • Figure 4: Global position tracking accuracy in real-world experiments. CLONE achieves mean tracking errors of 5.1 cm across distances up to 8.9 m, demonstrating effective closed-loop error correction in extended teleoperation.
  • Figure 5: Whole-body motion tracking on Unitree G1. CLONE successfully tracks diverse skills including (a) waving, (b)(d) squatting, and (c) jumping, showcasing comprehensive whole-body coordination capabilities.
  • ...and 8 more figures