Gravity-aware Grasp Generation with Implicit Grasp Mode Selection for Underactuated Hands

Tianyi Ko, Takuya Ikeda, Thomas Stewart, Robert Lee, Koichi Nishiwaki

TL;DR

This work tackles the fragility of precision grasps with gravity-aware learning: the network is trained on a continuous gravity-rejection score $f_g$, which prioritizes power grasps while retaining precision grasps when power grasps are not feasible. A data-generation pipeline creates densely annotated grasps (both power and precision), measures a disturbance-rejection score $f_i$ for each simulated disturbance direction $\bm{n}_i$, and projects these scores onto the scene's gravity direction $\bm{n}_g$ via $f_g = \min_{i: \bm{n}_i \cdot \bm{n}_g > \epsilon} f_i /(\bm{n}_i \cdot \bm{n}_g)$. The resulting data trains a 3D fully convolutional network that outputs $f_g$ and a grasp validness score. The approach uses a voxel-based grasp representation, an $L_2$ loss on positive samples for $f_g$, and a grasp validness head to curb out-of-domain predictions. Robustness improves, especially for heavy objects, as demonstrated in simulation and validated on a physical Robotiq hand. By explicitly modeling gravity-driven robustness, the method lets underactuated hands fall back automatically to precision grasps when necessary.
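The projection formula in the TL;DR can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the function name `gravity_rejection_score`, the default `eps`, and the choice to raise an error when no sampled direction qualifies are all assumptions. One way to read the formula is that a gravity load of magnitude $F$ along $\bm{n}_g$ has a component $F(\bm{n}_i \cdot \bm{n}_g)$ along each sampled direction $\bm{n}_i$, so the grasp holds only while $F \le f_i / (\bm{n}_i \cdot \bm{n}_g)$ for every direction with a sufficiently positive component.

```python
import numpy as np


def gravity_rejection_score(directions, scores, gravity, eps=0.1):
    """Project per-direction disturbance-rejection scores onto gravity.

    Implements f_g = min_{i : n_i . n_g > eps} f_i / (n_i . n_g).
    `directions` is an (N, 3) array of disturbance directions n_i,
    `scores` holds the matching f_i, and `gravity` is n_g.
    The default eps is a hypothetical value, not taken from the paper.
    """
    n = np.asarray(directions, dtype=float)
    n = n / np.linalg.norm(n, axis=1, keepdims=True)  # unit n_i
    g = np.asarray(gravity, dtype=float)
    g = g / np.linalg.norm(g)                          # unit n_g
    f = np.asarray(scores, dtype=float)

    cos = n @ g                 # n_i . n_g for every sampled direction
    mask = cos > eps            # keep directions roughly aligned with gravity
    if not mask.any():
        # Assumption: treat "no aligned direction" as an input error.
        raise ValueError("no disturbance direction is aligned with gravity")
    return float(np.min(f[mask] / cos[mask]))


# Six axis-aligned disturbance directions with made-up scores.
axis_directions = [[1, 0, 0], [-1, 0, 0], [0, 1, 0],
                   [0, -1, 0], [0, 0, 1], [0, 0, -1]]
axis_scores = [5.0, 5.0, 5.0, 5.0, 8.0, 2.0]
score_down = gravity_rejection_score(axis_directions, axis_scores, [0, 0, -1])
score_tilted = gravity_rejection_score(axis_directions, axis_scores, [1, 0, -1])
```

With gravity pointing straight down only the `-z` direction qualifies, so the score equals that direction's $f_i$. With gravity tilted 45 degrees, both `+x` and `-z` qualify and the weaker of the two scaled scores is returned.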

Abstract

Learning-based grasp detectors typically assume a precision grasp, where each finger has only one contact point, and estimate the grasp probability. In this work, we propose a data generation and learning pipeline that can leverage power grasping, which has more contact points with an enveloping configuration and is robust against both positioning error and force disturbance. To train a grasp detector to prioritize power grasping while still keeping precision grasping as the secondary choice, we propose to train the network against the magnitude of disturbance in the gravity direction that a grasp can resist (the gravity-rejection score) rather than a binary classification of success. We also provide an efficient data generation pipeline for a dataset with gravity-rejection score annotation. In addition to thorough ablation studies, quantitative evaluation both in simulation and on a real robot demonstrates a significant improvement with our approach, especially when the objects are heavy.
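To make the idea of implicit grasp mode selection concrete, the sketch below picks a grasp from two voxel volumes: a predicted gravity-rejection score and a grasp validness score. The selection rule (threshold validness, then take the highest score), the function name `select_grasp_voxel`, and the threshold value are assumptions made for illustration; the text above does not state how the final grasp is chosen. Under this rule no explicit power/precision label is needed: if power grasps tend to receive higher scores, they win whenever they are valid, and a precision grasp is chosen otherwise.

```python
import numpy as np


def select_grasp_voxel(f_g, validness, valid_threshold=0.5):
    """Return (voxel_index, score) of the best valid grasp, or None.

    `f_g` and `validness` are same-shaped volumes of predicted
    gravity-rejection scores and validness scores. The threshold and
    the argmax rule are hypothetical choices for this sketch.
    """
    f_g = np.asarray(f_g, dtype=float)
    validness = np.asarray(validness, dtype=float)
    if f_g.shape != validness.shape:
        raise ValueError("f_g and validness must have the same shape")

    mask = validness > valid_threshold
    if not mask.any():
        return None  # no voxel is considered a valid grasp

    # Invalid voxels can never win the argmax.
    masked = np.where(mask, f_g, -np.inf)
    idx = np.unravel_index(np.argmax(masked), masked.shape)
    return tuple(int(i) for i in idx), float(f_g[idx])


# Toy 2x2x2 volumes with made-up values.
scores = np.zeros((2, 2, 2))
valid = np.zeros((2, 2, 2))
scores[0, 1, 1] = 9.0                       # high score but invalid (e.g. collision)
scores[1, 0, 1], valid[1, 0, 1] = 4.0, 0.9  # stronger valid grasp
scores[1, 1, 0], valid[1, 1, 0] = 1.5, 0.8  # weaker valid grasp

best = select_grasp_voxel(scores, valid)

# If the stronger grasp becomes invalid, the weaker one is selected.
valid_blocked = valid.copy()
valid_blocked[1, 0, 1] = 0.1
fallback = select_grasp_voxel(scores, valid_blocked)
```

In the toy volumes, the invalid voxel with the highest raw score is ignored, and the fallback call shows the switch to the weaker grasp once the stronger one is ruled out.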

Paper Structure (13 sections, 2 equations, 9 figures, 2 tables)

This paper contains 13 sections, 2 equations, 9 figures, 2 tables.

Figures (9)

  • Figure 1: While existing works only handle precision grasping, power grasping is more robust against both initial position error and post-grasp force disturbance. We propose a data generation pipeline and a neural network model that prioritize power grasping if applicable (left) but implicitly switch back to precision grasping if power grasping is not available due to collision with the table (middle) or other objects (right).
  • Figure 2: (a) Most works assume a precision grasp (left) due to its simplicity. A power grasp (right) allows a more robust grasp thanks to the larger number of contacts. (b) Illustration of Eq. \ref{eq:gravity_projection}, which projects the multidimensional disturbance-rejection score onto the gravity direction to approximate the gravity-rejection score.
  • Figure 3: Schematic of our data generation pipeline. We first sample antipodal grasps and randomize them (top-left). Those initial grasps are refined through grasp simulation (top-right). We apply external force in multiple directions and acquire a multi-dimensional disturbance-rejection score for each grasp. Finally, we create multi-object scenes and project per-object grasp poses to the scene to acquire the training data (bottom). The disturbance-rejection score is projected to the scene's gravity direction to acquire the gravity-rejection score.
  • Figure 4: TCP coordinate definition (left) and scene for the static network input/output analysis (right). A $\diameter$65 $\times$ 200 mm cylinder and a 100 $\times$ 80 $\times$ 100 mm cube are placed on the surface, and the input/output of the network is visualized with the three blue-colored cross-sections.
  • Figure 5: Input TSDF volume (left) and output grasp volume (right) at the $z=0.15$ cross-section. Markers with a black circle denote voxels with positive classification. The short blue lines show the hand approach direction.
  • ...and 4 more figures