Learning Adaptive Neural Teleoperation for Humanoid Robots: From Inverse Kinematics to End-to-End Control
Sanjar Atamuradov
TL;DR
This paper tackles the limitations of traditional VR teleoperation for humanoid robots, where IK+PD pipelines struggle with force disturbances, motion artifacts, and user-specific adaptation. It proposes an end-to-end neural teleoperation framework that maps VR controller poses and robot proprioception directly to joint commands, using a VR encoder, a proprioception encoder, and an LSTM head for temporal coherence. Training proceeds in three stages: imitation from IK demonstrations, RL fine-tuning with smoothness and tracking rewards, and a force adaptation curriculum. Sim-to-real transfer relies on domain randomization and asymmetric critics, and the policy runs in real time at 50 Hz on the Unitree G1. Empirical results show 34% lower tracking error, 45% smoother motions, and a high user preference rate (87%), with successful sim-to-real transfer and robust force adaptation across manipulation tasks, highlighting the potential of learned teleoperation for natural, robust human-robot collaboration.
Abstract
Virtual reality (VR) teleoperation has emerged as a promising approach for controlling humanoid robots in complex manipulation tasks. However, traditional teleoperation systems rely on inverse kinematics (IK) solvers and hand-tuned PD controllers, which struggle to handle external forces, adapt to different users, and produce natural motions under dynamic conditions. In this work, we propose a learning-based neural teleoperation framework that replaces the conventional IK+PD pipeline with policies trained via reinforcement learning. Our approach learns to map VR controller inputs directly to robot joint commands while implicitly handling force disturbances, producing smooth trajectories, and adapting to user preferences. We train our policies in simulation, initializing them from demonstrations collected with IK-based teleoperation and then fine-tuning them with force randomization and trajectory smoothness rewards. Experiments on the Unitree G1 humanoid robot demonstrate that our learned policies achieve 34% lower tracking error, 45% smoother motions, and superior force adaptation compared to the IK baseline, while maintaining real-time performance (50 Hz control frequency). We validate our approach on manipulation tasks including object pick-and-place, door opening, and bimanual coordination. These results suggest that learning-based approaches can significantly improve the naturalness and robustness of humanoid teleoperation systems.
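To make the fine-tuning signal described above more concrete, the following is a minimal sketch of how a tracking reward, a smoothness penalty, and curriculum-scaled force randomization might be combined. It is not the paper's implementation: the function names, the exponential form of the tracking term, the action-rate form of the smoothness term, and every numeric default (`sigma`, `w_track`, `w_smooth`, `max_force`) are assumptions chosen for illustration only.

```python
import math
import random


def tracking_reward(target_pos, actual_pos, sigma=0.1):
    """Hypothetical tracking term: 1.0 at zero error, decaying with squared distance."""
    err_sq = sum((t - a) ** 2 for t, a in zip(target_pos, actual_pos))
    return math.exp(-err_sq / (sigma ** 2))


def smoothness_penalty(action, prev_action):
    """Hypothetical smoothness term: squared change between consecutive joint commands."""
    return sum((a - p) ** 2 for a, p in zip(action, prev_action))


def total_reward(target_pos, actual_pos, action, prev_action,
                 w_track=1.0, w_smooth=0.05):
    """Weighted sum of the two terms; the weights are illustrative assumptions."""
    return (w_track * tracking_reward(target_pos, actual_pos)
            - w_smooth * smoothness_penalty(action, prev_action))


def sample_external_force(progress, max_force=20.0, rng=random):
    """Sample a 3D disturbance force for one simulated episode.

    Each component is drawn uniformly from [-bound, bound], where the bound
    grows linearly with training progress in [0, 1] (a simple curriculum).
    The 20.0 default for max_force is an assumed value, not one from the paper.
    """
    progress = min(max(progress, 0.0), 1.0)
    bound = progress * max_force
    return [rng.uniform(-bound, bound) for _ in range(3)]
```

In a training loop along these lines, `sample_external_force` would be called once per episode (or per disturbance event) with the current training progress, and `total_reward` would be evaluated at every control step at the 50 Hz rate reported above. Passing a seeded `random.Random` instance as `rng` keeps the disturbances reproducible.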
