
UGG: Unified Generative Grasping

Jiaxin Lu, Hao Kang, Haoxiang Li, Bo Liu, Yiding Yang, Qixing Huang, Gang Hua

TL;DR

UGG is a unified diffusion-based dexterous grasp generation model that operates within the object point cloud and hand parameter spaces. It achieves state-of-the-art dexterous grasping on the large-scale DexGraspNet dataset while facilitating human-centric object design, marking a significant advancement in dexterous grasping research.

Abstract

Dexterous grasping aims to produce diverse grasping postures with a high grasping success rate. Regression-based methods that directly predict grasping parameters given the object may achieve a high success rate but often lack diversity. Generation-based methods that generate grasping postures conditioned on the object can often produce diverse grasps, but they fall short of a high grasping success rate because they lack discriminative information. To mitigate this, we introduce a unified diffusion-based dexterous grasp generation model, dubbed UGG, which operates within the object point cloud and hand parameter spaces. Our all-transformer architecture unifies the information from the object, the hand, and the contacts, introducing a novel representation of contact points for improved contact modeling. The flexibility and quality of our model enable the integration of a lightweight discriminator, benefiting from simulated discriminative data, which pushes for a high success rate while preserving high diversity. Beyond grasp generation, our model can also generate objects based on hand information, offering valuable insights into object design and into how the generative model perceives objects. Our model achieves state-of-the-art dexterous grasping on the large-scale DexGraspNet dataset while facilitating human-centric object design, marking a significant advancement in dexterous grasping research. Our project page is https://jiaxin-lu.github.io/ugg/.
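
The generate-then-discriminate idea in the abstract can be illustrated with a minimal Python sketch. This is not the UGG implementation: `toy_sampler` and `toy_score` are hypothetical stand-ins for the learned diffusion model and the learned discriminator, the flat `Grasp` vector is an assumed representation, and keeping the top-scoring candidates is an assumed selection rule rather than one stated in the paper.

```python
import math
import random
from typing import Callable, List

Grasp = List[float]  # hypothetical: a flat vector of hand parameters


def toy_sampler(rng: random.Random, dim: int = 4, steps: int = 10) -> Grasp:
    """Placeholder for a learned denoiser (assumption, not the paper's model).

    Starts from Gaussian noise and repeatedly shrinks it while injecting
    a small amount of fresh noise, loosely mimicking iterative denoising.
    """
    x = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    for _ in range(steps):
        x = [0.5 * v + 0.05 * rng.gauss(0.0, 1.0) for v in x]
    return x


def toy_score(grasp: Grasp) -> float:
    """Placeholder for a learned success discriminator; returns a value in (0, 1]."""
    norm = math.sqrt(sum(v * v for v in grasp))
    return 1.0 / (1.0 + norm)


def generate_and_filter(
    sampler: Callable[[random.Random], Grasp],
    score: Callable[[Grasp], float],
    n_candidates: int,
    keep: int,
    seed: int = 0,
) -> List[Grasp]:
    """Draw many candidates for diversity, then keep the highest-scoring ones."""
    rng = random.Random(seed)
    candidates = [sampler(rng) for _ in range(n_candidates)]
    candidates.sort(key=score, reverse=True)
    return candidates[:keep]


selected = generate_and_filter(toy_sampler, toy_score, n_candidates=32, keep=8)
```

Sampling more candidates than are kept is what lets a filter of this kind raise the success rate without collapsing diversity: the generator supplies variety, and the scorer only prunes.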

Paper Structure

This paper contains 37 sections, 15 equations, 14 figures, 5 tables, 3 algorithms.

Figures (14)

  • Figure 1: Overview of three tasks performed by the proposed UGG model: (a) Generating grasps with a fixed object involves denoising hand parameters and transformations for diverse successful postures. (b) Generating objects with a fixed hand posture denoises shape latents for varied object fits. (c) Jointly generating hand posture and object involves simultaneous denoising of both latents, yielding diverse grasps with objects.
  • Figure 2: Overview of the proposed method UGG: Our approach involves encoding and embedding the object, contact anchors, and hand to facilitate the learning of a unified diffusion model. During inference, random seeds are sampled and subjected to a denoising process to generate samples. To discern potentially successful grasps, a physics discriminator is introduced. Subsequently, an optimization stage is undertaken for all selected grasps, utilizing the generated contact anchor and input point cloud.
  • Figure 3: Motivation for unified contact modeling. (a) Subtle object changes can lead to grasping failure, as the corresponding adjustment in latent representation may not capture critical details adequately. (b) A contact map fails when embedded into a joint generation model. Left: generated noisy contact map (deeper color indicates closer contact). Right: generated Contact Anchor (yellow) with a grasp and the ground-truth (GT) contact map of the generated grasp. Zoom in for a better view.
  • Figure 4: Visualization of the generated diverse grasps of UGG on the DexGraspNet objects (mesh used only for visualization). Top: grasps of objects from seen categories; Bottom: grasps for objects of novel categories.
  • Figure 5: Visualization of Contact Anchor. The yellow points are the generated contact anchors. Zoom in for a better view.
  • ...and 9 more figures