OPG-Policy: Occluded Push-Grasp Policy Learning with Amodal Segmentation

Hao Ding, Yiming Zeng, Zhaoliang Wan, Hui Cheng

TL;DR

OPG-Policy tackles occluded goal-oriented grasping by integrating amodal segmentation to predict occluded target regions and guiding push-grasp actions through a Deep Q-Network. The framework comprises an amodal segmentation module, heightmap-based state representation, rotated-view Q-networks for push and grasp, and a coordinator that selects the action type using domain-informed features. Training uses a staged curriculum with adaptive rewards, including a dynamically updated threshold $T_g$ and a TD objective with $\delta_t$ and $\gamma$, enabling coordinated learning of pushing and grasping in clutter. Experimental results in simulation and real-world setups demonstrate superior motion efficiency and higher success rates than baselines, with strong generalization without real-world fine-tuning thanks to the amodal representations and coordinated action selection.
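The TD objective mentioned above can be illustrated with the standard one-step Q-learning error, $\delta_t = r_t + \gamma \max_{a'} Q(s_{t+1}, a') - Q(s_t, a_t)$. The sketch below is a minimal illustration under that assumption; the function names are hypothetical, and the Huber penalty is a common choice for deep Q-learning, not something this summary confirms for OPG-Policy.

```python
from typing import Sequence


def td_error(reward: float, gamma: float, q_current: float,
             next_q_values: Sequence[float], terminal: bool = False) -> float:
    """One-step TD error: r_t + gamma * max_a' Q(s_{t+1}, a') - Q(s_t, a_t).

    Assumption: standard Q-learning bootstrap; no bootstrap on terminal steps.
    """
    bootstrap = 0.0 if terminal or not next_q_values else max(next_q_values)
    return reward + gamma * bootstrap - q_current


def huber(delta: float, kappa: float = 1.0) -> float:
    """Huber penalty on the TD error (assumed loss, not confirmed by the source)."""
    abs_delta = abs(delta)
    if abs_delta <= kappa:
        return 0.5 * delta * delta
    return kappa * (abs_delta - 0.5 * kappa)


if __name__ == "__main__":
    # Hypothetical numbers: reward 1.0, discount 0.5, current estimate 0.2.
    delta = td_error(1.0, 0.5, 0.2, next_q_values=[0.4, 0.8])
    print(delta, huber(delta))
```

In a heightmap-based setup, `next_q_values` would be the flattened Q maps of the next state, so the maximum runs over every pixel and rotation.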

Abstract

Goal-oriented grasping in dense clutter, a fundamental challenge in robotics, demands an adaptive policy to handle occluded target objects and diverse configurations. Previous methods typically learn policies based on partially observable segments of the occluded target to generate motions. However, these policies often struggle to generate optimal motions due to uncertainties regarding the invisible portions of different occluded target objects across various scenes, resulting in low motion efficiency. To this end, we propose OPG-Policy, a novel framework that leverages amodal segmentation to predict occluded portions of the target and develop an adaptive push-grasp policy for cluttered scenarios where the target object is partially observed. Specifically, our approach trains a dedicated amodal segmentation module for diverse target objects to generate amodal masks. These masks and scene observations are mapped to the future rewards of grasp and push motion primitives via deep Q-learning to learn the motion critic. Afterward, the push and grasp motion candidates predicted by the critic, along with the relevant domain knowledge, are fed into the coordinator to generate the optimal motion implemented by the robot. Extensive experiments conducted in both simulated and real-world environments demonstrate the effectiveness of our approach in generating motion sequences for retrieving occluded targets, outperforming other baseline methods in success rate and motion efficiency.
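To make the notion of "motion candidates predicted by the critic" concrete, the sketch below picks the highest-valued entry from a stack of Q maps, one map per discretized rotation. The data layout (`[rotation][row][col]`), the uniform rotation spacing, and both function names are assumptions for illustration only. The coordinator that chooses between the best push and the best grasp is not sketched, since its inputs are not specified here.

```python
import math
from typing import List, Tuple

QMaps = List[List[List[float]]]  # assumed layout: [rotation][row][col]


def best_candidate(q_maps: QMaps) -> Tuple[float, int, int, int]:
    """Return (q_value, rotation_index, row, col) of the maximum Q entry."""
    best = (-math.inf, 0, 0, 0)
    for k, q_map in enumerate(q_maps):
        for row, values in enumerate(q_map):
            for col, q in enumerate(values):
                if q > best[0]:
                    best = (q, k, row, col)
    return best


def rotation_angle_deg(k: int, num_rotations: int) -> float:
    """Angle of rotation index k, assuming rotations evenly spaced over 360 degrees."""
    return 360.0 * k / num_rotations


if __name__ == "__main__":
    # Two hypothetical 2x2 Q maps, one per rotation.
    grasp_q = [[[0.1, 0.2], [0.3, 0.0]],
               [[0.5, 0.9], [0.4, 0.1]]]
    q, k, row, col = best_candidate(grasp_q)
    print(q, rotation_angle_deg(k, len(grasp_q)), (row, col))
```

Running the same selection on the push Q maps would yield the best push candidate; the two candidates are what the coordinator then compares.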

Paper Structure

This paper contains 21 sections, 8 equations, 9 figures, 2 tables.

Figures (9)

  • Figure 1: Example configuration. The target is the green cube, occluded by a block on top and surrounded by other blocks. We propose OPG-Policy to generate motion sequences, with the assistance of amodal segmentation, for retrieving occluded targets.
  • Figure 2: Occluded Push-Grasp Policy (OPG-Policy) pipeline. A fixed camera captures RGB-D images of the workspace, which are fed into the Amodal Segmentation Module to obtain the amodal mask of the severely occluded target. The RGB-D and amodal mask images are then orthographically projected along the gravity direction to obtain color, depth, and mask heightmaps. DenseNet extracts features from these heightmaps for the deep Q-networks, which predict push and grasp Q maps. These predictions and the relevant domain knowledge (not depicted in the figure) are then input into the coordinator, which determines the optimal action from the best push and the best grasp.
  • Figure 3: Training performance. The green and yellow lines respectively indicate the changes in the success rate and total grasp/push attempts of our model during the training process, based on measurements of 15 objects.
  • Figure 4: Performance across occlusion ratios. Task success rate (upper) and total grasp/push attempts (lower) of five approaches at different occlusion ratios. Our method outperforms the other methods by at least 1% in success rate and by at least 0.61 in total grasp/push attempts in the most complex case (0.6-0.8 occlusion).
  • Figure 5: Performance on challenging occlusion cases: success rate and total grasp/push attempts for each method. On average, our method leads the second-best method, GE-GRASP [liu2022ge], by 1% in success rate and by 0.79 in total grasp/push attempts.
  • ...and 4 more figures
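
The projection step described in the Figure 2 caption can be sketched as binning 3D points into a top-down grid and keeping the highest point per cell. This is a minimal sketch under stated assumptions: the points are already expressed in a workspace frame whose z axis opposes gravity, and the function name, workspace bounds, and resolution are hypothetical. Color and mask heightmaps would follow the same binning, carrying the color or mask value of the highest point instead of its height.

```python
import math
from typing import Iterable, List, Tuple

Point = Tuple[float, float, float]  # (x, y, z) in an assumed workspace frame


def depth_heightmap(points: Iterable[Point],
                    x_range: Tuple[float, float],
                    y_range: Tuple[float, float],
                    resolution: float) -> List[List[float]]:
    """Top-down height grid: each cell stores the maximum z of the points in it."""
    x_min, x_max = x_range
    y_min, y_max = y_range
    width = int(round((x_max - x_min) / resolution))
    height = int(round((y_max - y_min) / resolution))
    grid = [[0.0] * width for _ in range(height)]
    for x, y, z in points:
        # floor (not int) so points just outside the lower bound are rejected
        col = math.floor((x - x_min) / resolution)
        row = math.floor((y - y_min) / resolution)
        if 0 <= row < height and 0 <= col < width:
            grid[row][col] = max(grid[row][col], z)
    return grid


if __name__ == "__main__":
    # Hypothetical point cloud over a 0.2 m x 0.2 m workspace, 0.1 m cells.
    cloud = [(0.05, 0.05, 0.1), (0.05, 0.05, 0.3),
             (0.15, 0.05, 0.2), (1.0, 1.0, 0.5)]
    print(depth_heightmap(cloud, (0.0, 0.2), (0.0, 0.2), 0.1))
```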