FetchBench: A Simulation Benchmark for Robot Fetching

Beining Han, Meenal Parakh, Derek Geng, Jack A Defay, Gan Luyang, Jia Deng

TL;DR

This work proposes FetchBench, a new benchmark featuring diverse procedural scenes that integrate both grasping and motion planning challenges, and implements multiple baselines ranging from the traditional sense-plan-act pipeline to end-to-end behavior models.

Abstract

Fetching, which includes approaching, grasping, and retrieving, is a critical challenge for robot manipulation tasks. Existing methods primarily focus on table-top scenarios, which do not adequately capture the complexities of environments where both grasping and planning are essential. To address this gap, we propose FetchBench, a new benchmark featuring diverse procedural scenes that integrate both grasping and motion planning challenges. FetchBench also includes a data generation pipeline that collects successful fetch trajectories for use in imitation learning methods. We implement multiple baselines, ranging from the traditional sense-plan-act pipeline to end-to-end behavior models. Our empirical analysis reveals that these methods achieve a maximum success rate of only 20%, indicating substantial room for improvement. Finally, we identify key bottlenecks within the sense-plan-act pipeline and make recommendations based on this systematic analysis.

Paper Structure (29 sections, 11 figures, 9 tables)

Figures (11)

  • Figure 1: Examples of the fetching task in our benchmark. The red object is the target object that needs to be retrieved from the cluster to the free space in each task. Our scenes and tasks are generated with procedural rules, which mimic daily environments like shelves, cabinets, drawers, baskets, etc. The images are rendered with Isaac-Sim [isaacsim].
  • Figure 2: Example of the fetching task, including the initial state, the successful grasp state, and the task inputs (segmented point cloud and joint states).
  • Figure 3: Examples of procedural scenes designed with Infinigen [raistrick2024infinigen_indoors] assets (first row). A similar scene can be replicated in the real world with IKEA furniture (second row). The scenes are rendered in Infinigen.
  • Figure 4: The sense-plan-act pipeline commonly used in grasping frameworks [sundermeyer2021contact].
  • Figure 5: The success rate of different methods by category.
  • ...and 6 more figures