ImitationNet: Unsupervised Human-to-Robot Motion Retargeting via Shared Latent Space
Yashuai Yan, Esteve Valls Mascaro, Dongheui Lee
TL;DR
ImitationNet introduces an unsupervised framework for human-to-robot motion retargeting by learning a shared latent space for human poses and robot joints via adaptive contrastive learning and a global-rotation similarity metric. Encoders map human and robot poses to a common latent representation, and a decoder translates latent codes into robot joint commands, enabling direct control and latent-space interpolation for in-between motions. The method eliminates the need for paired data, achieves real-time performance on a Tiago++ robot, and supports multiple input modalities including text and RGB video. Results show improved retargeting precision and efficiency over a supervised baseline and demonstrate scalable generalization to new robots and modalities.
Abstract
This paper introduces a novel deep-learning approach for human-to-robot motion retargeting, enabling robots to mimic human poses accurately. Unlike prior deep-learning-based works, our method does not require paired human-to-robot data, which facilitates its transfer to new robots. First, we construct a shared latent space between humans and robots via adaptive contrastive learning that takes advantage of a proposed cross-domain similarity metric between the human and robot poses. Additionally, we propose a consistency term to build a common latent space that captures the similarity of the poses with precision while allowing direct robot motion control from the latent space. For instance, we can generate in-between motion through simple linear interpolation between two projected human poses. We conduct a comprehensive evaluation of robot control from diverse modalities (i.e., text, RGB videos, and key poses), which facilitates robot control for non-expert users. Our model outperforms existing works on human-to-robot retargeting in terms of efficiency and precision. Finally, we implemented our method on a real robot with self-collision avoidance through a whole-body controller to showcase the effectiveness of our approach. More information is available on our website: https://evm7.github.io/UnsH2R/
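The in-between motion generation mentioned in the abstract can be illustrated with a minimal sketch of linear interpolation between two latent codes. This is not the authors' implementation: the function name `interpolate_latents`, the latent dimensionality, and the toy vectors are assumptions made for illustration. In the actual pipeline, the two endpoints would come from the human-pose encoder, and each interpolated code would be passed to the decoder to obtain robot joint commands; neither network is shown here.

```python
import numpy as np


def interpolate_latents(z_start, z_end, num_steps):
    """Return num_steps latent codes evenly spaced from z_start to z_end.

    Hypothetical helper: z_start and z_end stand in for the latent codes
    of two projected human poses. Both endpoints are included.
    """
    z_start = np.asarray(z_start, dtype=float)
    z_end = np.asarray(z_end, dtype=float)
    if z_start.shape != z_end.shape:
        raise ValueError("latent codes must have the same shape")
    if num_steps < 2:
        raise ValueError("num_steps must be at least 2")
    # Interpolation weights in [0, 1], shaped (num_steps, 1) for broadcasting.
    alphas = np.linspace(0.0, 1.0, num_steps)[:, None]
    return (1.0 - alphas) * z_start + alphas * z_end


# Toy 2-D latent codes (assumed values, for illustration only).
z_a = [0.0, 0.0]
z_b = [1.0, 2.0]
trajectory = interpolate_latents(z_a, z_b, num_steps=5)
# Each row of `trajectory` is one latent code that a decoder could map to
# robot joint commands.
```

Whether a straight line in latent space yields smooth, plausible robot motion depends on how well the learned space is structured, which is what the contrastive and consistency objectives described above are meant to encourage.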
