Cross-Embodiment Dexterous Hand Articulation Generation via Morphology-Aware Learning

Heng Zhang; Kevin Yuchen Ma; Mike Zheng Shou; Weisi Lin; Yan Wu

TL;DR

This work tackles cross-embodiment dexterous grasp generation with an eigengrasp-based, end-to-end framework that derives a morphology embedding and hand-specific eigengrasps from a hand's URDF. An amplitude predictor, conditioned on object geometry and wrist pose, outputs low-dimensional coefficients that are decoded into full joint articulations; training is supervised by a Kinematic-Aware Articulation Loss (KAL) that emphasizes fingertip-relevant motions. Evaluated in simulation on three dexterous hands, the model achieves a 91.9% average grasp success rate on unseen objects with less than 0.4 seconds of inference per grasp. With few-shot adaptation to an unseen hand, it reaches 85.6% success on unseen objects in simulation and 87% in real-world experiments. These results suggest that grasp generation can scale across embodiments without large-scale, hand-specific training.

Abstract

Dexterous grasping with multi-fingered hands remains challenging due to high-dimensional articulations and the cost of optimization-based pipelines. Existing end-to-end methods require training on large-scale datasets for specific hands, which limits their ability to generalize across embodiments. We propose an eigengrasp-based, end-to-end framework for cross-embodiment grasp generation. From a hand's morphology description, we derive a morphology embedding and an eigengrasp set. Conditioned on these, together with the object point cloud and wrist pose, an amplitude predictor regresses articulation coefficients in a low-dimensional space, which are then decoded into full joint articulations. Articulation learning is supervised with a Kinematic-Aware Articulation Loss (KAL) that emphasizes fingertip-relevant motions and injects morphology-specific structure. In simulation on unseen objects across three dexterous hands, our model attains a 91.9% average grasp success rate with less than 0.4 seconds of inference per grasp. With few-shot adaptation to an unseen hand, it achieves 85.6% success on unseen objects in simulation, and real-world experiments with the same few-shot-adapted hand achieve an 87% success rate. The code and additional materials will be made available upon publication on our project website: https://connor-zh.github.io/cross_embodiment_dexterous_grasping.
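
To make the decoding step concrete, the sketch below shows how a short vector of amplitudes can be expanded into full joint angles using an eigengrasp basis, in the classical sense of a low-dimensional basis over joint space. It is an illustration only, not the authors' implementation: the function name `decode_articulation`, the mean-pose offset, the clipping to joint limits, and all numeric values are assumptions made for this example.

```python
from typing import List, Sequence, Tuple


def decode_articulation(
    amplitudes: Sequence[float],
    eigengrasps: Sequence[Sequence[float]],
    mean_pose: Sequence[float],
    joint_limits: Sequence[Tuple[float, float]],
) -> List[float]:
    """Reconstruct joint angles as q = mean_pose + sum_k a_k * e_k.

    The result is clipped to per-joint limits (an assumption of this sketch;
    a URDF does provide lower/upper limits for each joint).
    """
    if len(amplitudes) != len(eigengrasps):
        raise ValueError("one amplitude per eigengrasp is required")
    q = list(mean_pose)
    for a_k, e_k in zip(amplitudes, eigengrasps):
        if len(e_k) != len(q):
            raise ValueError("each eigengrasp must have one entry per joint")
        for j, e_kj in enumerate(e_k):
            q[j] += a_k * e_kj
    return [min(max(q_j, lo), hi) for q_j, (lo, hi) in zip(q, joint_limits)]


# Hypothetical 4-joint hand with 2 eigengrasps (illustrative values only).
eigengrasps = [
    [0.5, 0.5, 0.5, 0.5],    # all joints flex together
    [0.5, -0.5, 0.5, -0.5],  # alternating joints move in opposition
]
mean_pose = [0.2, 0.2, 0.2, 0.2]
joint_limits = [(0.0, 1.5)] * 4

# Two amplitudes are enough to specify all four joint angles.
q = decode_articulation([1.0, 0.4], eigengrasps, mean_pose, joint_limits)
print(q)
```

In this toy setup the predictor would only need to output two numbers per grasp, regardless of how many joints the hand has; the hand-specific basis carries the rest of the structure.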
