DAGDiff: Guiding Dual-Arm Grasp Diffusion to Stable and Collision-Free Grasps

Md Faizal Karim; Vignesh Vembar; Keshab Patra; Gaurav Singh; K Madhava Krishna

TL;DR

DAGDiff addresses reliable dual-arm grasping by generating grasp pairs directly in $SE(3) \times SE(3)$ using a diffusion process guided by force-closure and collision signals. It introduces an energy-based model $E_\alpha$ with score $s_\alpha$ and uses dual-arm mappings via $\operatorname{Logmap_2}$ and $\operatorname{Expmap_2}$ to enable end-to-end learning, augmented by two classifier-guidance modules for stability and collision avoidance. Training optimizes $\mathcal{L}_{\text{diff}}$, $\mathcal{L}_{\text{fc}}$, and $\mathcal{L}_{\text{col}}$, while inference incorporates gradient guidance and a collision-refinement threshold $t_c$ to produce low-energy, physically valid dual-arm grasps. Experiments on DG16M and Isaac Gym demonstrate substantial improvements over baselines, and zero-shot real-world tests on unseen objects confirm practical transfer to real sensor data with heterogeneous dual-arm hardware.

Abstract

Reliable dual-arm grasping is essential for manipulating large and complex objects, but it remains challenging because a grasp pair must be stable, collision-free, and able to generalize to new objects. Prior methods typically decompose the task into two independent grasp proposals, relying on region priors or heuristics that limit generalization and provide no principled guarantee of stability. We propose DAGDiff, an end-to-end framework that denoises directly to grasp pairs in the $SE(3) \times SE(3)$ space. Our key insight is that stability and collision avoidance can be enforced more effectively by guiding the diffusion process with classifier signals than by relying on explicit region detection or object priors. To this end, DAGDiff integrates geometry-, stability-, and collision-aware guidance terms that steer the generative process toward grasps that are physically valid and force-closure compliant. We evaluate DAGDiff through analytical force-closure checks, collision analysis, and large-scale physics-based simulations, showing consistent improvements over previous work on these metrics. Finally, we demonstrate that our framework generates dual-arm grasps directly on real-world point clouds of previously unseen objects, which are executed on a heterogeneous dual-arm setup where two manipulators reliably grasp and lift them.
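
To make the idea of guiding a sampler with classifier signals concrete, the following is a minimal toy sketch, not the DAGDiff implementation. It assumes a 2D point as a stand-in for a grasp pair (the real method works in $SE(3) \times SE(3)$), a hand-written quadratic energy in place of the learned $E_\alpha$, and a hand-written penalty in place of a learned collision classifier. The function names, the obstacle disc, the noise schedule, and the use of annealed Langevin updates are all illustrative assumptions.

```python
import math
import random

def score(x, mu=(0.5, 0.0)):
    """Toy score: -grad E(x) for the assumed energy E(x) = 0.5 * ||x - mu||^2."""
    return [-(x[0] - mu[0]), -(x[1] - mu[1])]

def collision_guidance_grad(x, center=(0.0, 0.0), radius=1.0, stiffness=20.0):
    """Gradient of a hypothetical log p(collision-free | x).

    Assumed form: log p = -0.5 * stiffness * max(0, radius - d)^2, where d is
    the distance to the obstacle center. The gradient pushes x out of the disc.
    """
    dx, dy = x[0] - center[0], x[1] - center[1]
    d = math.hypot(dx, dy)
    if d >= radius or d == 0.0:
        return [0.0, 0.0]
    push = stiffness * (radius - d)
    return [push * dx / d, push * dy / d]

def guided_langevin(x0, weight, sigmas, steps_per_level=20, base_step=0.3, seed=0):
    """Annealed Langevin sampling with an additive guidance gradient."""
    rng = random.Random(seed)
    x = list(x0)
    for sigma in sigmas:                      # coarse-to-fine noise levels
        eps = base_step * sigma
        for _ in range(steps_per_level):
            s, g = score(x), collision_guidance_grad(x)
            for i in range(2):
                drift = s[i] + weight * g[i]  # guided score
                x[i] += 0.5 * eps ** 2 * drift + eps * rng.gauss(0.0, 1.0)
    for _ in range(200):                      # noise-free refinement steps
        s, g = score(x), collision_guidance_grad(x)
        for i in range(2):
            x[i] += 0.05 * (s[i] + weight * g[i])
    return x

sigmas = [1.0, 0.5, 0.1]
unguided = guided_langevin([0.1, 0.1], weight=0.0, sigmas=sigmas)
guided = guided_langevin([0.1, 0.1], weight=1.0, sigmas=sigmas)
print("unguided distance to obstacle center:", math.hypot(*unguided))
print("guided distance to obstacle center:  ", math.hypot(*guided))
```

In this toy setting the unguided sample settles at the energy minimum, which lies inside the obstacle disc, while the guided sample is pushed toward the disc boundary. The soft penalty only trades off against the energy, so the guided sample ends close to the boundary rather than strictly outside it; a stronger `weight` or `stiffness` moves it further out.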
