6-DoF Grasp Detection in Clutter with Enhanced Receptive Field and Graspable Balance Sampling

Hanwen Wang, Ying Zhang, Yunlong Wang, Jian Li

TL;DR

This work tackles 6-DoF grasp detection in clutter with a focus on small-scale grasps. It introduces an enhanced receptive field via Multi-radii Cylinder Grouping and a Passive Attention module, plus a segmentation-guided Graspable Balance Sampling strategy to ensure balanced attention to small objects. The approach yields about a $10\%$ improvement in AP on GraspNet-1Billion and demonstrates strong performance in PyBullet simulations and real-world tests, including small- and mixed-scale grasping. The findings highlight the value of combining geometry-aware receptive-field expansion with semantics-guided sampling to boost fine-grained grasp perception and generalization in cluttered environments.

Abstract

6-DoF grasp detection of small-scale grasps is crucial for robots to perform specific tasks. This paper focuses on enhancing the recognition capability of small-scale grasping, aiming to improve the overall accuracy of grasping prediction results and the generalization ability of the network. We propose an enhanced receptive field method that includes a multi-radii cylinder grouping module and a passive attention module. This method enlarges the receptive field within the graspable space and strengthens the learning of graspable features. Additionally, we design a graspable balance sampling module based on a segmentation network, which enables the network to focus on features of small objects, thereby improving the recognition capability of small-scale grasping. Our network achieves state-of-the-art performance on the GraspNet-1Billion dataset, with an overall improvement of approximately 10% in average precision@k (AP). Furthermore, we deployed our grasp detection model on a PyBullet grasping platform, which validates the effectiveness of our method.
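To make the idea of multi-radii cylinder grouping more concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes a cylinder anchored at a candidate grasp point and aligned with the approach direction, and it collects the points that fall inside that cylinder for each of several radii. The function names, the radii, and the cylinder height are hypothetical values chosen for illustration; the page does not state the ones used in the paper.

```python
import numpy as np

def cylinder_group(points, center, approach, radius, height):
    """Indices of points inside a cylinder that starts at `center` and
    extends `height` along the (normalized) `approach` direction."""
    axis = approach / np.linalg.norm(approach)
    rel = points - center                       # (N, 3) offsets from the anchor
    along = rel @ axis                          # signed distance along the axis
    radial = np.linalg.norm(rel - np.outer(along, axis), axis=1)
    mask = (along >= 0.0) & (along <= height) & (radial <= radius)
    return np.flatnonzero(mask)

def multi_radii_cylinder_group(points, center, approach, radii, height):
    """Group the same anchor with several radii (hypothetical interface)."""
    return {r: cylinder_group(points, center, approach, r, height) for r in radii}

# Toy scene: five points around an anchor at the origin (units: meters).
points = np.array([
    [0.00, 0.0, 0.01],   # on the axis, inside every cylinder
    [0.02, 0.0, 0.01],   # inside only the two larger radii
    [0.04, 0.0, 0.01],   # inside only the largest radius
    [0.00, 0.0, 0.05],   # beyond the cylinder height
    [0.00, 0.0, -0.01],  # behind the anchor
])
groups = multi_radii_cylinder_group(
    points,
    center=np.zeros(3),
    approach=np.array([0.0, 0.0, 1.0]),
    radii=(0.01, 0.03, 0.05),   # illustrative radii, not the paper's values
    height=0.02,                # illustrative height, not the paper's value
)
for r, idx in groups.items():
    print(r, idx.tolist())
```

In this toy scene the smallest radius captures only the on-axis point, while the larger radii capture progressively more of the neighborhood. That is the intuition behind using several radii at once: fine local geometry and wider context are gathered for the same grasp candidate.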

Paper Structure (13 sections, 4 equations, 8 figures, 6 tables, 1 algorithm)


Figures (8)

  • Figure 1: We categorize small-scale grasping into two classes. The first class involves grasping small parts of medium to large objects on the tabletop. The second class pertains to grasping small objects at the tabletop level.
  • Figure 2: Pipeline. The backbone first extracts multiple features, and the graspable predictor then predicts points with high graspness. The graspable balance sampling module has two modes: during training it uses farthest point sampling directly, without guidance from the pre-trained segmentation model's features, and the model-guided sampling is used only during inference. The sampled features are then fed into ApproachNet to select the optimal grasp views, which are passed to our enhanced receptive field module for cylinder grouping. Finally, SWADNet processes the grouped features to output dense grasp poses.
  • Figure 3: Grasp representation and gripper coordinate system.
  • Figure 4: Schematic diagram of the cylinder grouping module. On the left is the conventional single-radius module, and on the right is our multi-radii module.
  • Figure 5: On the left is our designed enhanced receptive field method. On the right is the pipeline of our passive attention module.
  • ...and 3 more figures
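
The Figure 2 caption notes that, during training, the graspable balance sampling module falls back to plain farthest point sampling. As a reference for that step only, here is a minimal NumPy sketch of the standard farthest point sampling algorithm. It is not taken from the paper's code: the function name and the `start_index` parameter are assumptions. The segmentation-guided inference mode is not shown, since the page does not describe its details.

```python
import numpy as np

def farthest_point_sampling(points, num_samples, start_index=0):
    """Greedy farthest point sampling over an (N, D) array.

    Returns the indices of the selected points, in selection order.
    """
    n = points.shape[0]
    num_samples = min(num_samples, n)
    selected = np.empty(num_samples, dtype=np.int64)
    # Squared distance from every point to its nearest already-selected point.
    min_dist = np.full(n, np.inf)
    current = start_index
    for i in range(num_samples):
        selected[i] = current
        dist = np.sum((points - points[current]) ** 2, axis=1)
        min_dist = np.minimum(min_dist, dist)
        # Next pick: the point farthest from everything selected so far.
        current = int(np.argmax(min_dist))
    return selected

# Toy example: four points on a line.
line = np.array([[0.0, 0.0, 0.0],
                 [1.0, 0.0, 0.0],
                 [2.0, 0.0, 0.0],
                 [10.0, 0.0, 0.0]])
picked = farthest_point_sampling(line, num_samples=3)
print(picked.tolist())  # [0, 3, 2]
```

Because the procedure spreads samples evenly over space, large objects with many points tend to receive most of the samples. That is one way to read the motivation for guiding the sampling with segmentation features at inference time, so that small objects are not under-represented.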