S-RL Toolbox: Environments, Datasets and Evaluation Metrics for State Representation Learning

Antonin Raffin, Ashley Hill, René Traoré, Timothée Lesort, Natalia Díaz-Rodríguez, David Filliat

TL;DR

The paper tackles the lack of standardized evaluation for state representation learning (SRL) in robotics by introducing the S-RL Toolbox, a suite of environments, data generators, tasks, metrics and visualization tools designed for fast, reproducible SRL benchmarking in reinforcement learning (RL). It surveys SRL approaches (auto-encoders, robotic priors, forward/inverse models, and their combinations) and shows how the resulting representations can be evaluated on mobile-navigation and robotic-arm tasks, using both qualitative visualizations and quantitative metrics such as KNN-MSE, correlation with the ground truth, and GTC. The framework integrates multiple RL algorithms (notably PPO) to quantify how representation quality affects learning performance, and provides implementation guidance and datasets to facilitate replication. Overall, the toolbox supports rapid iteration, interpretability, and fair comparison of SRL methods for robotic control.

Abstract

State representation learning aims at learning compact representations from raw observations in robotics and control applications. Approaches used for this objective are auto-encoders, learning forward models, inverse dynamics or learning using generic priors on the state characteristics. However, the diversity in applications and methods makes the field lack standard evaluation datasets, metrics and tasks. This paper provides a set of environments, data generators, robotic control tasks, metrics and tools to facilitate iterative state representation learning and evaluation in reinforcement learning settings.

Paper Structure

This paper contains 28 sections, 5 equations, 5 figures, 9 tables.

Figures (5)

  • Figure 1: Environments and datasets for state representation learning.
  • Figure 2: Visual tools for analysing SRL; Left: Live trajectory of the robot in the state space. Centre: 3D scatter plot of a state space; clicking on any point displays the corresponding observation. Right: reconstruction of the point in the state space defined by the sliders. See complementary material for videos.
  • Figure 3: Correlation matrix for mobile robot navigation dataset (static target), between each dimension $s_i$ of predicted states $s$ and the ground truth $\tilde{s}_j$. We consider the ground truth to be the agent's real position. The states (dimension=2) are learned by combining a forward and an inverse model.
  • Figure 4: Performance (mean and standard error over 10 runs) of the PPO algorithm with different state representations learned in the mobile-robot-navigation (random target) environment.
  • Figure 5: Performance (mean and standard error) of RL algorithms using ground-truth states in the mobile robot (random target) environment.
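
The correlation matrix described in the Figure 3 caption can be illustrated with a minimal sketch. It computes the Pearson correlation between each learned dimension $s_i$ and each ground-truth dimension $\tilde{s}_j$, then summarizes it with a GTC-style score. Several things here are assumptions rather than the toolbox's code: the data is synthetic, the function names (`pearson`, `correlation_matrix`, `gtc`) are hypothetical, and the GTC summary (maximum absolute correlation per ground-truth dimension, then averaged) is one plausible reading of the metric that this page only names.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def correlation_matrix(states, ground_truth):
    """Entry [i][j] correlates learned dimension s_i with ground-truth dimension j."""
    state_dims = list(zip(*states))
    truth_dims = list(zip(*ground_truth))
    return [[pearson(s_i, t_j) for t_j in truth_dims] for s_i in state_dims]


def gtc(corr):
    """Assumed GTC summary: for each ground-truth dimension, the maximum
    absolute correlation over all learned dimensions."""
    n_truth = len(corr[0])
    return [max(abs(row[j]) for row in corr) for j in range(n_truth)]


# Synthetic data (not from the paper): the ground truth is a 2D position (x, y);
# the "learned" state swaps, rescales and flips the axes, plus a little noise.
rng = random.Random(0)
ground_truth = [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(500)]
states = [
    (2.0 * y + rng.gauss(0, 0.05), -x + rng.gauss(0, 0.05))
    for x, y in ground_truth
]

corr = correlation_matrix(states, ground_truth)
gtc_per_dim = gtc(corr)
gtc_mean = sum(gtc_per_dim) / len(gtc_per_dim)

for row in corr:
    print(["%+.2f" % value for value in row])
print("GTC per ground-truth dimension:", ["%.2f" % g for g in gtc_per_dim])
print("Mean GTC: %.2f" % gtc_mean)
```

In this toy case the large entries sit off the diagonal and one is negative, yet the mean GTC stays close to 1. A score based on absolute correlation is therefore indifferent to axis permutations, sign flips and rescaling of the learned state, which suits representations that are only expected to encode position up to such transformations.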