Learning to Transfer Human Hand Skills for Robot Manipulations
Sungjae Park, Seungho Lee, Mingi Choi, Jiye Lee, Jeonghwan Kim, Jisoo Kim, Hanbyul Joo
TL;DR
The paper tackles the problem of transferring human hand dexterous manipulation to robot hands amid embodiment gaps by learning a joint spatio-temporal manifold over object trajectories, human hand motions, and robot actions, trained with pseudo-ground-truth triplets synthesized from separate mocap and teleoperation data. A convolutional autoencoder encodes $(\textbf{O}, \textbf{H}, \textbf{R})$ into a latent code $\textbf{L}$, enabling inference of robot actions $\textbf{R}$ from given $\textbf{O}$ and $\textbf{H}$ via latent optimization, with an initial $\textbf{L}^{init}$ derived from a regression-based IK estimate. The proposed Hand-to-Robot Retargeting Model F and the synthetic data generation pipeline (Model S) achieve superior real-world performance on Bottle, Bowl, and Book tasks, demonstrating improved physical plausibility, robustness to mocap noise, and generalization to unseen trajectories. This approach offers a scalable data-driven path for translating human manipulation into robotic dexterity by explicitly modeling hand–object interactions rather than relying solely on kinematic point-matching.
Abstract
We present a method for teaching dexterous manipulation tasks to robots from human hand motion demonstrations. Unlike existing approaches that rely solely on kinematic information and do not account for the plausibility of robot-object interaction, our method directly infers plausible robot manipulation actions from human motion demonstrations. To address the embodiment gap between the human hand and the robot system, our approach learns a joint motion manifold that maps human hand movements, robot hand actions, and object movements in 3D, enabling us to infer one motion component from the others. Our key idea is the generation of pseudo-supervision triplets, which synthetically pair human, object, and robot motion trajectories. Through real-world experiments with robot hand manipulation, we demonstrate that our data-driven retargeting method significantly outperforms conventional retargeting techniques, effectively bridging the embodiment gap between human and robotic hands. Website at https://rureadyo.github.io/MocapRobot/.
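The idea of inferring one motion component from the others through a shared latent code can be illustrated with a toy sketch. This is not the paper's model: the paper uses a convolutional autoencoder over spatio-temporal trajectories, whereas the sketch below assumes a small fixed linear decoder acting on single vectors, with a zero vector standing in for the IK-based initial latent. All names (`infer_robot_action`, `W_o`, `W_h`, `W_r`) and all numeric values are hypothetical and chosen only for illustration.

```python
# Toy latent optimization: recover a latent code from the known object and
# hand parts, then decode the unknown robot part from that latent.
# ASSUMPTION: a fixed linear decoder replaces the learned convolutional decoder.


def matvec(W, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def infer_robot_action(W_o, W_h, W_r, obj, hand, latent_init,
                       steps=500, lr=0.05):
    """Gradient descent on the latent so the decoded (obj, hand) match."""
    latent = list(latent_init)
    W_known = W_o + W_h      # stack the decoder rows for the observed parts
    target = obj + hand      # stack the observed values in the same order
    for _ in range(steps):
        residual = [p - t for p, t in zip(matvec(W_known, latent), target)]
        # Gradient of 0.5 * ||W_known @ latent - target||^2 is W_known^T @ residual.
        grad = [sum(W_known[i][j] * residual[i] for i in range(len(W_known)))
                for j in range(len(latent))]
        latent = [l - lr * g for l, g in zip(latent, grad)]
    return matvec(W_r, latent), latent


# Hypothetical decoder weights (latent dimension 2).
W_o = [[1.0, 0.0], [0.0, 1.0]]                 # latent -> object part
W_h = [[0.5, 0.5], [1.0, -1.0], [0.0, 0.5]]    # latent -> hand part
W_r = [[2.0, 1.0], [-1.0, 0.5]]                # latent -> robot part

# Build a consistent (object, hand, robot) triplet from a known latent.
true_latent = [0.3, -0.7]
obj = matvec(W_o, true_latent)
hand = matvec(W_h, true_latent)
true_robot = matvec(W_r, true_latent)

# Zero vector is a stand-in for an IK-based initial latent estimate.
robot, latent = infer_robot_action(W_o, W_h, W_r, obj, hand, [0.0, 0.0])
print("recovered robot part:", robot)
print("reference robot part:", true_robot)
```

In this toy setting the object and hand observations fully determine the latent, so the decoded robot part matches the reference. With a learned nonlinear decoder the optimization is non-convex, which is one reason a reasonable initial latent matters.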
