NeuronsGym: A Hybrid Framework and Benchmark for Robot Tasks with Sim2Real Policy Learning

Haoran Li, Shasha Liu, Mingjun Ma, Guangzheng Hu, Yaran Chen, Dongbin Zhao

TL;DR

NeuronsGym is a hybrid framework for policy learning on robot tasks, comprising a simulation platform for training policies and a physical system for studying sim2real problems.

Abstract

The rise of embodied AI has made general mobile agent systems considerably more feasible. Many evaluation platforms with rich scenes, high visual fidelity, and varied application scenarios have been developed. In this paper, we present NeuronsGym, a hybrid framework for policy learning on robot tasks that comprises a simulation platform for training policies and a physical system for studying sim2real problems. Unlike most current robotic platforms, which are single-task and slow-moving, our framework provides agile physical robots with a wider range of speeds and can be used to train robotic navigation and confrontation policies. To evaluate the safety of robot navigation, we propose safety-weighted path length (SFPL), a metric intended to improve safety evaluation in mobile robot navigation. On this platform, we build a new benchmark for navigation and confrontation tasks by comparing current mainstream sim2real methods, and we hold the 2022 IEEE Conference on Games (CoG) RoboMaster sim2real challenge. We release the code of this framework (https://github.com/DRL-CASIA/NeuronsGym) and hope that this platform can promote the development of more flexible and agile general mobile agent algorithms.

Paper Structure (47 sections, 18 equations, 10 figures, 6 tables)


Figures (10)

  • Figure 1: Overview of the hybrid framework - NeuronsGym. The framework is composed of simulation and physical systems. Agents can interact with simulation systems or physical systems through communication protocols to achieve agent training or evaluation. The agent policy can access the parameter manager to adjust the parameters of the robot model or environment in the simulation system. In addition, the same scenario and task are set in each system to study the sim2real of the robot policy.
  • Figure 2: Friction force analysis of the robot wheel.
  • Figure 3: Simulation arenas. Left: the arena built from the built-in geometric modules in Unity3D. Right: the arena reconstructed with RealityCapture from images of the real arena.
  • Figure 4: Several different sim2real methods. In the training process, each trial starts by sampling the simulator parameters from the parameter generator; the agent is then trained in the simulator and transferred to the real robot. The parameter generators of $(a)$ and $(e)$ return the same parameters every time, whereas those of $(b) - (d)$ sample parameters from a distribution, so the results differ each time. Of these, $(b)$ requires no real-robot data, $(c)$ requires offline robot data, and $(d)$ requires online robot data. Unlike $(c)$ and $(d)$, $(e)$ uses offline data to learn an action transformer that corrects the state generated by the simulator.
  • Figure 5: Blocking zones and evaluation scenarios. The green squares are the blocking zones. From left to right: sample evaluation scenarios from Level 1, Level 2, and Level 3.
  • ...and 5 more figures
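
The per-trial parameter sampling described in the Figure 4 caption can be illustrated with a minimal sketch. It is not the NeuronsGym API: `SimParams`, `FixedGenerator`, `RandomizedGenerator`, `run_trials`, and the two parameter names are hypothetical, and the uniform ranges are an assumed choice of distribution. The sketch only contrasts a generator that returns the same parameters on every trial, as in $(a)$ and $(e)$, with one that draws new parameters on every trial without any real-robot data, as in $(b)$.

```python
import random
from dataclasses import dataclass


@dataclass
class SimParams:
    """Hypothetical simulator parameters; the field names are illustrative only."""
    wheel_friction: float
    motor_gain: float


class FixedGenerator:
    """Returns the same parameters on every trial (the behaviour of panels (a) and (e))."""

    def __init__(self, params):
        self.params = params

    def sample(self):
        return self.params


class RandomizedGenerator:
    """Draws fresh parameters on every trial (panel (b)); uniform ranges are an assumption."""

    def __init__(self, friction_range, gain_range, seed=None):
        self.friction_range = friction_range
        self.gain_range = gain_range
        self.rng = random.Random(seed)

    def sample(self):
        return SimParams(
            wheel_friction=self.rng.uniform(*self.friction_range),
            motor_gain=self.rng.uniform(*self.gain_range),
        )


def run_trials(generator, train_one_trial, num_trials):
    """Sample parameters at the start of each trial, then train with them."""
    used = []
    for _ in range(num_trials):
        params = generator.sample()
        train_one_trial(params)  # stand-in for training the agent in the simulator
        used.append(params)
    return used
```

Passing a `FixedGenerator` to `run_trials` yields identical parameters across trials, while a `RandomizedGenerator` yields a different draw per trial; the data-driven variants $(c)$ and $(d)$ would additionally update the sampling distribution from robot data, which this sketch does not attempt.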