Constrained 6-DoF Grasp Generation on Complex Shapes for Improved Dual-Arm Manipulation
Gaurav Singh, Sanket Kalwar, Md Faizal Karim, Bipasha Sen, Nagamanikandan Govindan, Srinath Sridhar, K Madhava Krishna
TL;DR
This work tackles constrained 6-DoF grasp generation on complex shapes for dual-arm manipulation. It introduces CGDF, a diffusion-based model that uses part-guided diffusion and convolutional plane features to generate dense, region-specific grasps without requiring constraint-augmented training data. The method operates on the SE(3) manifold within an energy-based diffusion framework: Logmap/Expmap mappings move between grasp poses and their vector representation, a neural energy decoder scores grasps, and a max-energy formulation guides diffusion toward the target regions. On the DA$^2$ dataset, CGDF outperforms state-of-the-art constrained and unconstrained baselines in Force Closure, Grasp Success Rate, and Target Grasps, supporting its use for dual-arm planning on complex object geometries. Because it does not depend on constraint-augmented datasets, the approach offers a sample-efficient route to region-aware grasping.
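To make the Logmap/Expmap step concrete, the following is a minimal NumPy sketch of the standard SE(3) exponential and logarithm maps, plus one common way to perturb a pose with Gaussian noise in the tangent space. It is an illustration under stated assumptions, not the CGDF implementation: the function names (`hat`, `expmap`, `logmap`, `perturb`), the twist ordering (translation first, then rotation), and the noise scheme are all choices made here for illustration, and rotations with angle near π are not handled.

```python
import numpy as np

def hat(w):
    """Map a 3-vector to its 3x3 skew-symmetric matrix."""
    return np.array([[0.0, -w[2], w[1]],
                     [w[2], 0.0, -w[0]],
                     [-w[1], w[0], 0.0]])

def _left_jacobian(w):
    """Matrix V that couples the rotational and translational parts of a twist."""
    theta = np.linalg.norm(w)
    K = hat(w)
    if theta < 1e-8:  # first-order approximation near the identity
        return np.eye(3) + 0.5 * K
    return (np.eye(3)
            + (1.0 - np.cos(theta)) / theta**2 * K
            + (theta - np.sin(theta)) / theta**3 * (K @ K))

def expmap(xi):
    """Twist xi = (v, w) in R^6 -> 4x4 homogeneous transform in SE(3)."""
    v, w = xi[:3], xi[3:]
    theta = np.linalg.norm(w)
    K = hat(w)
    if theta < 1e-8:
        R = np.eye(3) + K
    else:  # Rodrigues' formula
        R = (np.eye(3)
             + np.sin(theta) / theta * K
             + (1.0 - np.cos(theta)) / theta**2 * (K @ K))
    H = np.eye(4)
    H[:3, :3] = R
    H[:3, 3] = _left_jacobian(w) @ v
    return H

def logmap(H):
    """4x4 transform in SE(3) -> twist (v, w) in R^6 (assumes angle < pi)."""
    R, t = H[:3, :3], H[:3, 3]
    cos_theta = np.clip((np.trace(R) - 1.0) / 2.0, -1.0, 1.0)
    theta = np.arccos(cos_theta)
    skew = np.array([R[2, 1] - R[1, 2], R[0, 2] - R[2, 0], R[1, 0] - R[0, 1]])
    w = 0.5 * skew if theta < 1e-8 else theta / (2.0 * np.sin(theta)) * skew
    v = np.linalg.solve(_left_jacobian(w), t)
    return np.concatenate([v, w])

def perturb(H, sigma, rng):
    """Hypothetical noising step: add Gaussian noise in the tangent space."""
    return expmap(logmap(H) + sigma * rng.standard_normal(6))

# Round trip: a twist mapped to SE(3) and back is recovered.
xi = np.array([0.1, -0.2, 0.3, 0.4, -0.5, 0.6])
H = expmap(xi)
xi_back = logmap(H)
H_noisy = perturb(H, sigma=0.05, rng=np.random.default_rng(0))
```

Working in the 6-dimensional tangent space lets noise be added and gradients be taken with ordinary vector arithmetic, while `expmap` guarantees that the result is again a valid rigid transform.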
Abstract
Efficiently generating grasp poses tailored to specific regions of an object is vital for various robotic manipulation tasks, especially in a dual-arm setup. This scenario presents a significant challenge due to the complex geometries involved, requiring a deep understanding of the local geometry to generate grasps efficiently on the specified constrained regions. Existing methods explore only settings involving table-top or small objects and require augmented datasets for training, which limits their performance on complex objects. We propose CGDF: Constrained Grasp Diffusion Fields, a diffusion-based grasp generative model that generalizes to objects with arbitrary geometries and generates dense grasps on the target regions. CGDF uses a part-guided diffusion approach that enables it to achieve high sample efficiency in constrained grasping without explicitly training on massive constraint-augmented datasets. We provide qualitative and quantitative comparisons, using analytical metrics and simulation in both unconstrained and constrained settings, to show that our method generalizes to generate stable grasps on complex objects, which is especially useful for dual-arm manipulation, while existing methods struggle to do so.
