RLDS: an Ecosystem to Generate, Share and Use Datasets in Reinforcement Learning

Sabela Ramos, Sertan Girgin, Léonard Hussenot, Damien Vincent, Hanna Yakubovich, Daniel Toyama, Anita Gergely, Piotr Stanczyk, Raphael Marinier, Jeremiah Harmsen, Olivier Pietquin, Nikola Momchev

TL;DR

Reinforcement learning research suffers from data inefficiency and heterogeneous dataset formats that hinder reproducibility and cross-dataset evaluation. The authors present RLDS, an ecosystem comprising EnvLogger and RLDS Creator to generate data, a library of RL-specific transformations, and integration with TensorFlow Datasets (TFDS) to share datasets losslessly. The system preserves temporal and episodic structure (through SAR and RSA alignments) and supports annotations, human-in-the-loop data collection, and flexible pipelines for transforming data into algorithm-ready formats. They validate the approach with Robosuite datasets and provide open-source tools and notebooks to facilitate adoption, with the aim of accelerating Sequential Decision Making (SDM) research through reproducible, shareable data.

Abstract

We introduce RLDS (Reinforcement Learning Datasets), an ecosystem for recording, replaying, manipulating, annotating and sharing data in the context of Sequential Decision Making (SDM), including Reinforcement Learning (RL), Learning from Demonstrations, Offline RL and Imitation Learning. RLDS not only enables reproducibility of existing research and easy generation of new datasets, but also accelerates novel research. By providing a standard and lossless dataset format, it makes it possible to quickly test new algorithms on a wider range of tasks. The RLDS ecosystem makes it easy to share datasets without any loss of information and to remain agnostic to the underlying original format when applying data processing pipelines to large collections of datasets. In addition, RLDS provides tools for collecting data generated by either synthetic agents or humans, as well as for inspecting and manipulating the collected data. Ultimately, integration with TFDS facilitates the sharing of RL datasets with the research community.

Paper Structure

This paper contains 20 sections, 11 figures.

Figures (11)

  • Figure 1: RLDS takes advantage of the inherently standard structure of RL datasets and represents them as a dataset of episodes where each of the episodes contains a nested dataset of steps.
  • Figure 2: The Environment Logger
  • Figure 3: Some Environments supported by RLDS Creator: Atari games, DMLab (3D learning environment based on id Software's Quake III Arena), NetHack (single-player text-only roguelike game), Procgen (procedurally-generated 2D games), Robodesk and Robosuite (robot arm).
  • Figure 4: The histograms produced by the code snippet above.
  • Figure 5: The communication protocol between the client and the server in RLDS Creator.
  • ...and 6 more figures
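
The nested structure described in the Figure 1 caption (a dataset of episodes, each containing a nested dataset of steps) can be illustrated with a small plain-Python sketch. This is not the RLDS library API: the helper names `make_episode`, `iter_transitions` and `sar_to_rsa` are hypothetical, the step fields are a simplified assumption, and the SAR/RSA reading used here (SAR: a step holds the observation, the action taken on it and the resulting reward; RSA: a step holds the reward of the previous action, then the observation and action) is one interpretation of the alignments mentioned in the TL;DR.

```python
from typing import Any, Dict, Iterator, List, Sequence, Tuple

Step = Dict[str, Any]
Episode = Dict[str, Any]


def make_episode(episode_id: str,
                 observations: Sequence[Any],
                 actions: Sequence[Any],
                 rewards: Sequence[float]) -> Episode:
    """Build one episode whose steps use SAR alignment (assumed layout)."""
    n = len(observations)
    if not (len(actions) == n and len(rewards) == n):
        raise ValueError("observations, actions and rewards must have equal length")
    steps: List[Step] = []
    for t in range(n):
        steps.append({
            "observation": observations[t],
            "action": actions[t],    # action taken on this observation
            "reward": rewards[t],    # reward obtained from that action
            "is_first": t == 0,
            "is_last": t == n - 1,   # on the last step, action/reward are placeholders
        })
    # Episode-level metadata lives next to the nested sequence of steps.
    return {"episode_id": episode_id, "steps": steps}


def iter_transitions(dataset: Sequence[Episode]) -> Iterator[Tuple[Any, Any, float, Any]]:
    """Flatten episodes into (obs, action, reward, next_obs) tuples.

    Pairs are only formed inside an episode, so episode boundaries are preserved.
    """
    for episode in dataset:
        steps = episode["steps"]
        for cur, nxt in zip(steps, steps[1:]):
            yield (cur["observation"], cur["action"], cur["reward"], nxt["observation"])


def sar_to_rsa(steps: Sequence[Step], first_reward: float = 0.0) -> List[Step]:
    """Shift rewards one step later so each step holds the previous action's reward."""
    shifted: List[Step] = []
    previous_reward = first_reward  # placeholder: no action precedes the first step
    for step in steps:
        new_step = dict(step)       # copy so the input steps are left untouched
        new_step["reward"] = previous_reward
        previous_reward = step["reward"]
        shifted.append(new_step)
    return shifted


if __name__ == "__main__":
    dataset = [make_episode("ep0", [0, 1, 2], [1, 1, 0], [0.0, 1.0, 0.0])]
    for transition in iter_transitions(dataset):
        print(transition)
    print([s["reward"] for s in sar_to_rsa(dataset[0]["steps"])])
```

Keeping steps nested inside episodes, rather than storing a flat list of transitions, is what lets a pipeline derive either representation later without losing episode boundaries.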