PreAfford: Universal Affordance-Based Pre-Grasping for Diverse Objects and Environments

Kairui Ding, Boyuan Chen, Ruihai Wu, Yuyang Li, Zongzheng Zhang, Huan-ang Gao, Siqi Li, Guyue Zhou, Yixin Zhu, Hao Dong, Hao Zhao

TL;DR

PreAfford tackles two-finger grasping of diverse objects across varied environments by coupling a relay-trained, two-module framework (pre-grasping and grasping) with a point-level affordance representation. Each module contains affordance, proposal, and critic networks that operate on RGB-D inputs processed by PointNet++; the resulting action proposals are evaluated in a closed-loop fashion, and a data-driven reward signal guides pre-grasping. Trained offline on ShapeNet-v2 and validated in both simulation and real-world experiments, PreAfford improves grasping success by up to 69% on unseen categories and proves deployable across multiple setups. The work advances scene-aware, geometry-conscious manipulation for a wide range of objects and environments, and it identifies robustness to irregular shapes and dynamic contexts as an area for further improvement.

Abstract

Robotic manipulation with two-finger grippers is challenged by objects lacking distinct graspable features. Traditional pre-grasping methods, which typically involve repositioning objects or utilizing external aids like table edges, are limited in their adaptability across different object categories and environments. To overcome these limitations, we introduce PreAfford, a novel pre-grasping planning framework incorporating a point-level affordance representation and a relay training approach. Our method significantly improves adaptability, allowing effective manipulation across a wide range of environments and object types. When evaluated on the ShapeNet-v2 dataset, PreAfford not only enhances grasping success rates by 69% but also demonstrates its practicality through successful real-world experiments. These improvements highlight PreAfford's potential to redefine standards for robotic handling of complex manipulation tasks in diverse settings.

Paper Structure (18 sections, 7 equations, 6 figures, 4 tables)

Figures (6)

  • Figure 1: Pre-grasping leverages environmental features to enhance graspability. (a) An object lying flat on the floor, ungraspable in its current position. (b) Side-grasping an object that overhangs a surface. (c) Grasping an angled part protruding from a slot. (d) Grasping the middle of a phone suspended at the foot of a slope. (e) Pinning a phone against a wall and grasping it from the opposite side.
  • Figure 2: The framework of PreAfford. The framework consists of two main modules, each incorporating three networks: an affordance network, a proposal network, and a critic network. These networks respectively handle tasks of choosing the contact point, generating a proposal, and evaluating the proposal. PointNet++ (PN++) and MLP are employed to process point clouds and facilitate decision-making. During the inference phase, both modules collaborate to develop strategies for pre-grasping and grasping. In contrast, during the training phase, the grasping module generates rewards for training the pre-grasping module, a process we refer to as relay.
  • Figure 3: Multi-feature scenario: PreAfford effectively addresses scenarios where multiple environmental features are present simultaneously.
  • Figure 4: Qualitative Results. We demonstrate pre-grasping manipulation on training and testing categories in four scenarios—edge, slot, slope, and wall. Affordance maps highlight effective interaction areas, showing PreAfford's capability to devise suitable pre-grasping and grasping strategies for various object categories and scenes, including both seen and unseen objects.
  • Figure 5: Real-world pre-grasping manipulations with affordance maps. Red areas in the maps indicate optimal pushing locations. Point clouds are captured by Femto Bolt.
  • ...and 1 more figures
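
The flow described in the Figure 2 caption (the affordance network chooses a contact point, the proposal network generates a proposal, the critic network evaluates it, and the grasping module supplies the reward used to train the pre-grasping module) can be sketched roughly as below. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `select_action`, `relay_reward`, and the three `toy_*` functions are hypothetical, the learned PointNet++/MLP networks are replaced by hand-written stand-ins, an action is reduced to a single direction vector, and the relay reward is assumed to be the change in the grasping module's best score.

```python
import math
import random
from typing import Callable, Sequence, Tuple

Point = Tuple[float, float, float]
Action = Tuple[float, float, float]  # hypothetical: a unit push/approach direction


def select_action(
    points: Sequence[Point],
    affordance: Callable[[Point], float],
    propose: Callable[[Point, random.Random], Action],
    critic: Callable[[Point, Action], float],
    num_proposals: int = 8,
    seed: int = 0,
) -> Tuple[Point, Action, float]:
    """Affordance picks the contact point, proposal samples actions, critic ranks them."""
    rng = random.Random(seed)
    # 1) Affordance: choose the point with the highest per-point score.
    contact = max(points, key=affordance)
    # 2) Proposal: sample several candidate actions at that contact point.
    candidates = [propose(contact, rng) for _ in range(num_proposals)]
    # 3) Critic: keep the candidate with the highest predicted score.
    best = max(candidates, key=lambda a: critic(contact, a))
    return contact, best, critic(contact, best)


def relay_reward(grasp_score_before: float, grasp_score_after: float) -> float:
    """Assumed relay signal: how much the pre-grasp raised the grasping module's best score."""
    return grasp_score_after - grasp_score_before


# Toy stand-ins for the learned networks (assumptions, not the paper's models).
def toy_affordance(p: Point) -> float:
    return p[0]  # prefer points with the largest x coordinate


def toy_propose(p: Point, rng: random.Random) -> Action:
    theta = rng.uniform(0.0, 2.0 * math.pi)
    return (math.cos(theta), math.sin(theta), 0.0)  # random horizontal direction


def toy_critic(p: Point, a: Action) -> float:
    return a[0]  # score a direction by its +x component


if __name__ == "__main__":
    cloud = [(0.0, 0.0, 0.0), (0.1, 0.0, 0.0), (0.2, 0.05, 0.0)]
    contact, action, score = select_action(cloud, toy_affordance, toy_propose, toy_critic)
    print("contact point:", contact)
    print("chosen action:", action, "critic score:", score)
    print("relay reward:", relay_reward(0.2, 0.9))
```

In this sketch the same `select_action` routine would serve both modules, each with its own trio of networks; only the pre-grasping module would be trained against `relay_reward`, mirroring the relay idea in the caption.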