
Constrained Generative Sampling of 6-DoF Grasps

Jens Lundell, Francesco Verdoja, Tran Nguyen Le, Arsalan Mousavian, Dieter Fox, Ville Kyrki

TL;DR

This work introduces VCGS, a variational, constrained 6-DoF grasp sampler, to generate grasps targeted to arbitrary regions on an object, addressing inefficiency in unconstrained sampling for task-specific manipulation. It pairs VCGS with CONG, a large-scale dataset of millions of constrained grasps across 2889 objects, enabling training without task-specific labels. Across simulation and real-robot experiments, VCGS surpasses a state-of-the-art unconstrained baseline (GraspNet) by about 10–15% in grasp success and by 2–3× in sample efficiency, demonstrating the value of general constrained grasping. The approach combines a CVAE-based constrained grasp sampler with a grasp evaluator, enabling robust sampling and filtering, and points to future work on richer constraint types and constraint-conditioned evaluation.

Abstract

Most state-of-the-art data-driven grasp sampling methods propose stable and collision-free grasps uniformly on the target object. For bin-picking, executing any of those reachable grasps is sufficient. However, for completing specific tasks, such as squeezing out liquid from a bottle, we want the grasp to be on a specific part of the object's body while avoiding other locations, such as the cap. This work presents a generative grasp sampling network, VCGS, capable of constrained 6 Degrees of Freedom (DoF) grasp sampling. We also curate a new dataset designed to train and evaluate methods for constrained grasping. The new dataset, called CONG, consists of over 14 million training samples of synthetically rendered point clouds and grasps at random target areas on 2889 objects. VCGS is benchmarked against GraspNet, a state-of-the-art unconstrained grasp sampler, in simulation and on a real robot. The results demonstrate that VCGS achieves a 10-15% higher grasp success rate than the baseline while being 2-3 times as sample efficient. Supplementary material is available on our project website.

Paper Structure (14 sections, 3 equations, 5 figures, 3 tables)

Figures (5)

  • Figure 1: An example grasp generated by VCGS on the target grasping area highlighted in red.
  • Figure 2: The grasp in green, its point cloud representation in red, the center grasp point in cyan, and the distance d between it and the black point on the object. The center grasp point is set to the average of the two leftmost and the two rightmost points of the gripper.
  • Figure 3: An example of how the dataset is curated. (a) From the object mesh, (b) a point cloud is rendered, and a query point, highlighted in red, is selected. Given the query point, (c) all neighbors within a specific radius from it are found, and (d) the grasps close to those points are stored.
  • Figure 4: An example grasp from the simulation.
  • Figure 5: The 10 objects used in the real-world experiment. All objects, except (a), (j), (k), and (l), are from the YCB object dataset (Calli et al., 2015). The dashed red lines depict the target grasping area for each object.
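
The captions of Figures 2 and 3 describe two geometric steps: computing a center grasp point from the gripper's point representation, and keeping the grasps that lie close to the neighborhood of a query point. The sketch below illustrates both with NumPy. It is not the authors' implementation. The function names, the choice of the x-axis as the "left/right" direction, and the `radius` and `max_dist` thresholds are assumptions made for illustration only.

```python
import numpy as np


def grasp_center(gripper_points):
    """Center grasp point: mean of the two leftmost and two rightmost points.

    Assumption: "left/right" is measured along the x-axis of the
    (N, 3) gripper point array.
    """
    order = np.argsort(gripper_points[:, 0])
    ends = np.concatenate([order[:2], order[-2:]])
    return gripper_points[ends].mean(axis=0)


def grasps_near_query(cloud, query_idx, grasp_centers, radius, max_dist):
    """Indices of grasps whose center lies near the query point's neighborhood.

    cloud:         (P, 3) object point cloud
    query_idx:     index of the query point in `cloud`
    grasp_centers: (G, 3) center grasp points, one per candidate grasp
    radius:        neighborhood radius around the query point (assumed value)
    max_dist:      largest allowed distance d from a grasp center to the
                   nearest neighborhood point (assumed value)
    """
    query = cloud[query_idx]
    # Step (c): all cloud points within `radius` of the query point.
    neighbors = cloud[np.linalg.norm(cloud - query, axis=1) <= radius]
    # Step (d): distance from every grasp center to its nearest neighbor point.
    diffs = grasp_centers[:, None, :] - neighbors[None, :, :]
    d = np.linalg.norm(diffs, axis=2).min(axis=1)
    return np.flatnonzero(d <= max_dist)
```

A brute-force distance matrix keeps the example short; for point clouds of realistic size, a spatial index such as a k-d tree would likely be a better fit.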