Contact Map Transfer with Conditional Diffusion Model for Generalizable Dexterous Grasp Generation

Yiyao Ma, Kai Chen, Kexin Zheng, Qi Dou

TL;DR

This work tackles the challenge of generating stable and task-aligned dexterous grasps that generalize to unseen objects and tasks. It proposes a transfer-based framework that uses a conditional diffusion model to transfer a template-derived contact map to a target object, guided by shape similarity and textual task descriptions, and extends this to jointly transfer three object-centric maps—contact, part, and direction—via a cascaded diffusion process. A robust grasp optimization then identifies reliable regions and efficiently recovers grasp parameters by leveraging the transferred maps. Extensive experiments on CapGrasp SemGrasp demonstrate strong generalization to novel objects and categories, high task alignment, and superior performance over analytical, generative, and transfer-based baselines, highlighting the method’s potential for practical dexterous manipulation.

Abstract

Dexterous grasp generation is a fundamental challenge in robotics, requiring both grasp stability and adaptability across diverse objects and tasks. Analytical methods ensure stable grasps but are inefficient and lack task adaptability, while generative approaches improve efficiency and task integration but generalize poorly to unseen objects and tasks due to data limitations. In this paper, we propose a transfer-based framework for dexterous grasp generation, leveraging a conditional diffusion model to transfer high-quality grasps from shape templates to novel objects within the same category. Specifically, we reformulate the grasp transfer problem as the generation of an object contact map, incorporating object shape similarity and task specifications into the diffusion process. To handle complex shape variations, we introduce a dual mapping mechanism that captures the intricate geometric relationships between shape templates and novel objects. Beyond the contact map, we derive two additional object-centric maps, the part map and the direction map, to encode finer contact details for more stable grasps. We then develop a cascaded conditional diffusion model framework to jointly transfer these three maps, ensuring their intra-consistency. Finally, we introduce a robust grasp recovery mechanism that identifies reliable contact points and optimizes grasp configurations efficiently. Extensive experiments demonstrate the superiority of our proposed method. Our approach effectively balances grasp quality, generation efficiency, and generalization performance across various tasks. Project homepage: https://cmtdiffusion.github.io/
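To make the three object-centric maps more concrete, the following is a minimal sketch of how such maps could be computed from a known grasp on a template object. It is not the authors' implementation: the abstract does not define the maps, so the function name `object_centric_maps`, the exponential distance-to-contact mapping, the temperature `tau`, and the nearest-neighbor definitions of the part and direction maps are all assumptions chosen for illustration.

```python
import numpy as np


def object_centric_maps(obj_pts, hand_pts, hand_part_ids, tau=0.01):
    """Compute illustrative contact, part, and direction maps.

    ASSUMPTIONS (not taken from the paper):
      - contact value = exp(-d / tau), where d is the distance from an
        object point to its nearest hand surface point;
      - part value = id of the hand part owning that nearest point;
      - direction value = unit vector from the object point to that point.

    obj_pts:       (N, 3) object surface points
    hand_pts:      (M, 3) hand surface points
    hand_part_ids: (M,)   integer part label per hand point
    """
    # Pairwise offsets from every object point to every hand point.
    diff = hand_pts[None, :, :] - obj_pts[:, None, :]   # (N, M, 3)
    dist = np.linalg.norm(diff, axis=-1)                # (N, M)

    # Nearest hand point for each object point.
    nearest = dist.argmin(axis=1)                       # (N,)
    rows = np.arange(len(obj_pts))
    d_min = dist[rows, nearest]                         # (N,)

    # Contact map in (0, 1]; 1 means the point touches the hand.
    contact = np.exp(-d_min / tau)

    # Part map: which hand part is closest to each object point.
    part = hand_part_ids[nearest]

    # Direction map: unit vector toward the nearest hand point
    # (a zero vector where the point is already in contact).
    vec = diff[rows, nearest]                           # (N, 3)
    direction = vec / np.maximum(d_min, 1e-8)[:, None]

    return contact, part, direction


if __name__ == "__main__":
    # Toy data: two object points and two hand points with part labels.
    obj = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
    hand = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.5]])
    parts = np.array([0, 3])
    contact, part, direction = object_centric_maps(obj, hand, parts)
    print(contact, part, direction)
```

Under these assumptions, all three maps live on the object's surface points rather than in the hand's joint space. That is one way to read why the abstract calls them object-centric and why they can be transferred from a shape template to a novel object.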

Paper Structure

This paper contains 18 sections, 20 equations, 18 figures, 6 tables.

Figures (18)

  • Figure 1: Comparison of our proposed framework with existing analytical and generative methods. The proposed transfer-based framework can effectively balance efficiency, quality, task adaptability, and generalization capability for dexterous grasp generation.
  • Figure 2: Our conditional diffusion model learns to transfer contact maps via a template-target framework. The template branch encodes shape templates $(\mathbf{x}_e,\mathbf{c}_e)\rightarrow\mathbf{h}_e$, while the target branch denoises latent vector $\mathbf{z}_t$ conditioned on target geometry $\mathbf{x}_a$, template features $\mathbf{h}_e$, and language $\ell$. A bidirectional adaptation module $\mathcal{A}$ bridges these branches through feature integration.
  • Figure 3: An overview of the cascaded diffusion framework.
  • Figure 3: Quantitative ablation study results.
  • Figure 4: Qualitative comparison with Tink for contact map transfer on novel objects.
  • ...and 13 more figures