RLBench: The Robot Learning Benchmark & Learning Environment

Stephen James, Zicong Ma, David Rovick Arrojo, Andrew J. Davison

TL;DR

RLBench addresses the lack of standardized, scalable benchmarks for visually guided robotic manipulation by offering 100 diverse tasks with rich sensor data and an infinite supply of motion-planned demonstrations. The framework provides a task-builder and a PyRep-based environment to support end-to-end evaluation across reinforcement learning, imitation learning, and few-shot learning, including a formal 1.0 version of a large-scale few-shot challenge. Its design emphasizes diversity, reproducibility, scalability, extensibility, graded difficulty, and realism, aiming to unify traditional and learning-based methods under a common platform. By enabling end-to-end evaluation, sim-to-real transfer considerations, and easy task extension, RLBench seeks to accelerate progress across a broad range of robotics research domains.

Abstract

We present a challenging new benchmark and learning environment for robot learning: RLBench. The benchmark features 100 completely unique, hand-designed tasks ranging in difficulty from simple target reaching and door opening to longer multi-stage tasks, such as opening an oven and placing a tray in it. We provide an array of both proprioceptive observations and visual observations, which include RGB, depth, and segmentation masks from an over-the-shoulder stereo camera and an eye-in-hand monocular camera. Uniquely, each task comes with an infinite supply of demos through the use of motion planners operating on a series of waypoints given during task creation time, enabling an exciting flurry of demonstration-based learning. RLBench has been designed with scalability in mind; new tasks, along with their motion-planned demos, can be easily created and then verified by a series of tools, allowing users to submit their own tasks to the RLBench task repository. This large-scale benchmark aims to accelerate progress in a number of vision-guided manipulation research areas, including reinforcement learning, imitation learning, multi-task learning, geometric computer vision, and, in particular, few-shot learning. With the benchmark's breadth of tasks and demonstrations, we propose the first large-scale few-shot challenge in robotics. We hope that the scale and diversity of RLBench offer unparalleled research opportunities in the robot learning community and beyond.

Paper Structure

This paper contains 25 sections, 7 figures.

Figures (7)

  • Figure 1: RLBench is a large-scale benchmark consisting of 100 completely unique, hand-designed tasks. In this figure we show a sample of 24 tasks that feature in the benchmark. Example tasks include stacking a set of 6 colored blocks in a pyramid (top left), inserting a shape onto a peg (top right), finishing the setup of a checkers board (bottom left), and watering a plant (bottom right). To get a better understanding of the variety of tasks, please watch the video.
  • Figure 2: The V-REP scene consists of a Franka Panda affixed to a wooden table, surrounded by 3 directional lights. Observations include RGB, depth, and segmentation masks from an over-the-shoulder stereo camera and an eye-in-hand monocular camera, along with robot proprioceptive data, which includes joint angles, velocities, and torques, and the gripper pose. The arm can be easily swapped out for another arm if required.
  • Figure 3: A sample of the visual observations given from both the over-the-shoulder stereo and eye-in-hand monocular cameras, which supply RGB, depth, and mask images.
  • Figure 4: An example showing the distinction between task, variation, and episode. In this case, the 'stack_blocks' task has $V$ variations, each with $E$ episodes. Each variation comes with a list of textual descriptions of the objective. Across variations, the target objects or colours usually change, whereas across episodes the object positions change.
  • Figure 5: Example usage of the RLBench Environment for training a reinforcement learning agent. When using demonstrations, users can either point to a set of saved demonstrations (as shown here), or alternatively generate demonstrations on the fly.
  • ...and 2 more figures
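
The task, variation, and episode distinction described in the Figure 4 caption can be made concrete with a small data-structure sketch. This is an illustration only and does not use the RLBench API: the class names (`Task`, `Variation`, `Episode`), the helper `build_stack_blocks`, the colour list, the block counts, and the position ranges are all hypothetical choices. It shows a variation fixing the objective and its textual descriptions, while each episode only resamples object positions.

```python
import random
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass(frozen=True)
class Episode:
    """One concrete layout: block positions sampled for a fixed objective."""
    block_positions: Tuple[Tuple[float, float], ...]


@dataclass
class Variation:
    """One objective within a task (hypothetical fields, not RLBench's)."""
    index: int
    descriptions: List[str]
    target_colour: str
    num_blocks: int

    def sample_episode(self, rng: random.Random) -> Episode:
        # Across episodes only positions change; the objective stays fixed.
        positions = tuple(
            (round(rng.uniform(-0.3, 0.3), 3), round(rng.uniform(-0.3, 0.3), 3))
            for _ in range(self.num_blocks)
        )
        return Episode(block_positions=positions)


@dataclass
class Task:
    """A named task holding all of its variations."""
    name: str
    variations: List[Variation] = field(default_factory=list)


def build_stack_blocks(colours: List[str], max_blocks: int) -> Task:
    """Enumerate variations as (colour, block count) pairs; assumed scheme."""
    task = Task(name="stack_blocks")
    for colour in colours:
        for n in range(2, max_blocks + 1):
            task.variations.append(
                Variation(
                    index=len(task.variations),
                    descriptions=[
                        f"stack {n} {colour} blocks",
                        f"place {n} of the {colour} cubes on top of each other",
                    ],
                    target_colour=colour,
                    num_blocks=n,
                )
            )
    return task


# V variations for the task, then E episodes drawn from one variation.
task = build_stack_blocks(["red", "green", "blue"], max_blocks=4)
rng = random.Random(0)
variation = task.variations[0]
episodes = [variation.sample_episode(rng) for _ in range(3)]
```

Here the number of variations $V$ is the number of (colour, count) pairs, and the number of episodes $E$ is however many layouts are sampled from a given variation.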