Unsupervised Neural Motion Retargeting for Humanoid Teleoperation
Satoshi Yagi, Mitsunori Tada, Eiji Uchibe, Suguru Kanoga, Takamitsu Matsubara, Jun Morimoto
TL;DR
This work tackles human-to-humanoid teleoperation without paired training data or manual joint pre-specification. It introduces a CycleGAN-based framework that learns a shared latent representation for human and humanoid motions, using separate posture and motion encoders and a set of losses that enforce reconstruction, latent consistency, adversarial realism, and end-effector velocity alignment. The approach enables real-time retargeting and is validated on upper-body motions, with end-effector errors comparable to those of IK-based methods, and is demonstrated on a real pick-and-place task with a Torobo humanoid. The results show promising usability and robustness to operator variation, while also highlighting data requirements and limitations in end-effector orientation that suggest directions for future improvement and broader deployment in teleoperation.
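The four loss families named above are commonly combined into a single training objective as a weighted sum. The following is a minimal sketch of that idea only, not the authors' implementation: the mean-squared-error form of each term, the non-saturating adversarial loss, the weights, and all toy values are assumptions introduced for illustration.

```python
import math

def mse(a, b):
    """Mean squared error between two equal-length sequences of floats."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

def generator_adv_loss(disc_prob, eps=1e-12):
    """Non-saturating GAN generator loss (assumed form): small when the
    discriminator rates the retargeted motion as real."""
    return -math.log(max(disc_prob, eps))

def total_loss(terms, weights):
    """Weighted sum of named loss terms."""
    return sum(weights[name] * value for name, value in terms.items())

# Toy values standing in for network outputs (all hypothetical).
human_pose = [0.10, 0.20, 0.30]
human_pose_rec = [0.10, 0.25, 0.30]  # human -> latent -> human reconstruction
z_human = [0.5, -0.5]                # latent code of the human motion
z_retargeted = [0.5, -0.4]           # latent code re-encoded from the humanoid motion
human_ee_vel = [0.0, 0.1, 0.0]       # human end-effector velocity
robot_ee_vel = [0.0, 0.1, 0.1]       # retargeted end-effector velocity
disc_prob = 0.5                      # discriminator's "real" probability

terms = {
    "reconstruction": mse(human_pose, human_pose_rec),
    "latent": mse(z_human, z_retargeted),
    "adversarial": generator_adv_loss(disc_prob),
    "ee_velocity": mse(human_ee_vel, robot_ee_vel),
}
# Hypothetical weights; the actual balance between terms is not stated here.
weights = {"reconstruction": 1.0, "latent": 1.0, "adversarial": 0.1, "ee_velocity": 1.0}
loss = total_loss(terms, weights)
```

In practice each term would be computed from encoder, decoder, and discriminator outputs in a deep learning framework; the sketch only shows how the terms fit together.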
Abstract
This study proposes an approach to human-to-humanoid teleoperation using GAN-based online motion retargeting, which removes the need to construct pairwise datasets to identify the relationship between human and humanoid kinematics. The proposed teleoperation system is therefore expected to reduce the complexity and setup requirements typically associated with humanoid controllers, facilitating more accessible and intuitive teleoperation systems for users without robotics knowledge. The experiments demonstrated that the proposed method can retarget a range of upper-body human motions to a humanoid, including a body jab motion and a basketball shoot motion. Moreover, human-in-the-loop teleoperation performance was evaluated by measuring the end-effector position errors between the human and the retargeted humanoid motions. The errors were comparable to those of conventional motion retargeting methods that require pairwise motion datasets. Finally, a box pick-and-place task was conducted to demonstrate the usability of the developed humanoid teleoperation system.
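One plausible way to compute an end-effector position error like the one described above is the mean Euclidean distance between corresponding human and humanoid end-effector positions over a trajectory. The sketch below assumes both trajectories are already expressed in a common frame and scale and are time-aligned; the function name and sample data are hypothetical, and the paper's exact metric may differ.

```python
import math

def mean_ee_position_error(human_traj, robot_traj):
    """Mean Euclidean distance between corresponding 3D end-effector positions."""
    if len(human_traj) != len(robot_traj) or not human_traj:
        raise ValueError("trajectories must be non-empty and of equal length")
    total = sum(math.dist(h, r) for h, r in zip(human_traj, robot_traj))
    return total / len(human_traj)

# Hypothetical time-aligned hand positions in metres.
human = [(0.0, 0.0, 0.0), (0.1, 0.0, 0.0), (0.2, 0.0, 0.0)]
robot = [(0.0, 0.0, 0.0), (0.1, 0.03, 0.04), (0.2, 0.0, 0.1)]

error_m = mean_ee_position_error(human, robot)
```

Because human and humanoid limb lengths differ, some normalization (for example, scaling by arm length) is usually needed before such a comparison is meaningful; that step is left out here.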
