Contact Map Transfer with Conditional Diffusion Model for Generalizable Dexterous Grasp Generation
Yiyao Ma, Kai Chen, Kexin Zheng, Qi Dou
TL;DR
This work tackles the challenge of generating stable, task-aligned dexterous grasps that generalize to unseen objects and tasks. It proposes a transfer-based framework in which a conditional diffusion model transfers a template-derived contact map to a target object, guided by shape similarity and textual task descriptions. A cascaded diffusion process extends this to jointly transfer three object-centric maps: contact, part, and direction. A robust grasp optimization then identifies reliable regions in the transferred maps and uses them to recover grasp parameters efficiently. Extensive experiments on the CapGrasp dataset from SemGrasp demonstrate strong generalization to novel objects and categories, high task alignment, and better performance than analytical, generative, and transfer-based baselines, highlighting the method’s potential for practical dexterous manipulation.
Abstract
Dexterous grasp generation is a fundamental challenge in robotics, requiring both grasp stability and adaptability across diverse objects and tasks. Analytical methods ensure stable grasps but are inefficient and lack task adaptability, while generative approaches improve efficiency and task integration but generalize poorly to unseen objects and tasks due to data limitations. In this paper, we propose a transfer-based framework for dexterous grasp generation that leverages a conditional diffusion model to transfer high-quality grasps from shape templates to novel objects within the same category. Specifically, we reformulate the grasp transfer problem as the generation of an object contact map, incorporating object shape similarity and task specifications into the diffusion process. To handle complex shape variations, we introduce a dual mapping mechanism that captures the intricate geometric relationships between shape templates and novel objects. Beyond the contact map, we derive two additional object-centric maps, the part map and the direction map, to encode finer contact details for more stable grasps. We then develop a cascaded conditional diffusion model framework to jointly transfer these three maps while keeping them mutually consistent. Finally, we introduce a robust grasp recovery mechanism that identifies reliable contact points and optimizes grasp configurations efficiently. Extensive experiments demonstrate the superiority of our proposed method, which effectively balances grasp quality, generation efficiency, and generalization performance across various tasks. Project homepage: https://cmtdiffusion.github.io/
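
To make the three object-centric maps more concrete, the following is a minimal sketch of how a contact map, part map, and direction map might be computed for a template grasp. It is an illustration under stated assumptions, not the authors' formulation: the function name `object_centric_maps`, the Gaussian distance falloff, the `sigma` parameter, and the nearest-point definitions of the part and direction maps are all hypothetical choices made here.

```python
import math


def object_centric_maps(object_points, hand_points, hand_part_ids, sigma=0.01):
    """Compute per-object-point contact, part, and direction values.

    Assumed (hypothetical) definitions, not taken from the paper:
      contact   = exp(-d^2 / (2 * sigma^2)), d = distance to the nearest hand point
      part      = part id of that nearest hand point
      direction = unit vector from the object point to that hand point
    """
    contact, part, direction = [], [], []
    for p in object_points:
        # Index of the nearest hand surface point (brute-force search).
        j = min(range(len(hand_points)), key=lambda k: math.dist(p, hand_points[k]))
        d = math.dist(p, hand_points[j])
        contact.append(math.exp(-d * d / (2.0 * sigma * sigma)))
        part.append(hand_part_ids[j])
        if d > 0.0:
            direction.append(tuple((h - o) / d for o, h in zip(p, hand_points[j])))
        else:
            # Coincident points have no well-defined direction.
            direction.append((0.0, 0.0, 0.0))
    return contact, part, direction


if __name__ == "__main__":
    obj = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    hand = [(0.0, 0.0, 0.01), (1.0, 0.0, 1.0)]
    parts = [0, 3]  # hypothetical hand-part labels
    c, p, d = object_centric_maps(obj, hand, parts)
    print(c, p, d)
```

In this sketch, object points near the hand receive contact values close to 1 and distant points values close to 0, so all three maps live on the object surface and are independent of the hand's parameterization. That object-centric property is what would allow such maps to be transferred from a template shape to a novel object before grasp parameters are recovered.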
