DDGC: Generative Deep Dexterous Grasping in Clutter

Jens Lundell, Francesco Verdoja, Ville Kyrki

TL;DR

DDGC presents a generative deep network for fast, collision-free dexterous grasping in clutter by predicting the 6-DOF grasp pose $\mathbf{p}$ and hand configuration $\mathbf{q}$ from a single RGB-D image. The method combines scene completion, image encoding, a coarse-to-fine grasp generator, a differentiable finger refinement layer, and a Wasserstein discriminator with dedicated losses to produce multiple high-quality grasps in under a second. Trained entirely on synthetic clutter data, DDGC outperforms Multi-FinGAN and the GraspIt! simulated-annealing planner in both simulation and real hardware, achieving higher grasp quality, greater clearance rates, and faster sampling by about $4$–$5\times$. The work demonstrates strong sim-to-real transfer without fine-tuning and provides a scalable path to practical multi-finger grasping in cluttered environments, thanks to its scene-aware encoding and coarse-to-fine refinement pipeline.

Abstract

Recent advances in multi-fingered robotic grasping have enabled fast 6-Degrees-Of-Freedom (DOF) single-object grasping. Multi-finger grasping in cluttered scenes, on the other hand, remains mostly unexplored due to the added difficulty of reasoning over obstacles, which greatly increases the computational time needed to generate high-quality collision-free grasps. In this work we address such limitations by introducing DDGC, a fast generative multi-finger grasp sampling method that can generate high-quality grasps in cluttered scenes from a single RGB-D image. DDGC is built as a network that encodes scene information to produce coarse-to-fine collision-free grasp poses and configurations. We experimentally benchmark DDGC against the simulated-annealing planner in GraspIt! on 1200 simulated cluttered scenes and 7 real-world scenes. The results show that DDGC outperforms the baseline at synthesizing high-quality grasps and removing clutter while being 5 times faster. This, in turn, opens the door to using multi-finger grasps in practical applications, which has so far been limited by the excessive computation time needed by other methods.

Paper Structure

This paper contains 16 sections, 6 equations, 6 figures, 2 tables.

Figures (6)

  • Figure 1: Using a single RGB-D image of a cluttered scene as input, our proposed generative grasp planner can produce up to 10 collision-free multi-fingered grasps with various grasp types in less than a second.
  • Figure 2: Example scenes from our dataset with one to four objects.
  • Figure 3: (a) shows that DDGC consistently finds higher-quality grasps than both GraspIt! and Multi-FinGAN. (b) shows a histogram of the finger spread of grasps generated with DDGC.
  • Figure 4: Example grasps proposed by DDGC in simulated scenes (top row), generated from the RGB inputs shown in the bottom row. The black background represents the table surface (best viewed in color).
  • Figure 5: Scenes used for testing
  • ...and 1 more figures