Open-Source Reinforcement Learning Environments Implemented in MuJoCo with Franka Manipulator

Zichun Xu; Yuntao Li; Xiaohang Yang; Zhiyuan Zhao; Lei Zhuang; Jingdong Zhao

TL;DR

This work addresses the shortage of realistic, open benchmarks for robotic manipulation in reinforcement learning by introducing three open-source MuJoCo-based environments built around the Franka Emika Panda arm from MuJoCo Menagerie and implemented through the Gymnasium Robotics API. The three tasks (FrankaPush, FrankaSlide, and FrankaPickAndPlace) follow the Multi-Goal RL framework, support both sparse and dense rewards, and expose goal-conditioned observations, enabling consistent benchmarking. The authors validate fidelity and task difficulty by training three off-policy algorithms (DDPG, SAC, and TQC) with Hindsight Experience Replay, finding that the distributional method TQC typically outperforms DDPG and SAC across tasks. Together with detailed XML configurations and hyperparameters, the benchmarks offer a lightweight, CPU-friendly, and reproducible platform for evaluating reinforcement learning algorithms on robotic manipulation. Possible future enhancements include impedance control and finer timesteps.
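To make the role of Hindsight Experience Replay concrete, the sketch below relabels a failed toy episode using the commonly described "future" strategy, in which goals are resampled from positions reached later in the same episode. It is an illustration under stated assumptions, not the authors' training code: the function names, the transition dictionary layout, the 0.05 distance threshold, and `k=4` are all hypothetical choices made for this example.

```python
import math
import random


def sparse_reward(achieved_goal, desired_goal, distance_threshold=0.05):
    """Binary reward: 0 within the threshold of the goal, -1 otherwise.

    The 0.05 threshold is an assumed value for illustration only.
    """
    close = math.dist(achieved_goal, desired_goal) <= distance_threshold
    return 0.0 if close else -1.0


def her_relabel_future(episode, k=4, seed=0):
    """Create k relabeled copies of each transition ("future" strategy).

    Each copy replaces the desired goal with a goal actually achieved at the
    same or a later step of the episode, then recomputes the reward.
    """
    rng = random.Random(seed)
    relabeled = []
    for t, transition in enumerate(episode):
        for _ in range(k):
            future_t = rng.randrange(t, len(episode))
            new_goal = episode[future_t]["next_achieved_goal"]
            relabeled.append({
                **transition,
                "desired_goal": new_goal,
                "reward": sparse_reward(transition["next_achieved_goal"], new_goal),
            })
    return relabeled


# Toy episode: the object moves along x but never reaches the real goal.
goal = (0.5, 0.0, 0.0)
positions = [(0.0, 0.0, 0.0), (0.1, 0.0, 0.0), (0.2, 0.0, 0.0), (0.3, 0.0, 0.0)]
episode = [
    {
        "next_achieved_goal": positions[t + 1],
        "desired_goal": goal,
        "reward": sparse_reward(positions[t + 1], goal),
    }
    for t in range(3)
]
extra = her_relabel_future(episode)
```

Every original transition in the toy episode carries a reward of -1, whereas some relabeled copies carry a reward of 0. This is how relabeling gives an off-policy learner a usable signal even when the sparse-reward task is never solved during exploration.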

Abstract

This paper presents three open-source reinforcement learning environments developed on the MuJoCo physics engine with the Franka Emika Panda arm in MuJoCo Menagerie. Three representative tasks, push, slide, and pick-and-place, are implemented through the Gymnasium Robotics API, which inherits from the core of Gymnasium. Both sparse binary and dense rewards are supported, and the observation space contains the desired-goal and achieved-goal keys required by the Multi-Goal Reinforcement Learning framework. Three off-policy algorithms are used to validate the simulation attributes and ensure the fidelity of all tasks, and benchmark results are also given. Each environment and task is defined cleanly, and the main parameters for modifying the environment are preserved to reflect the main differences. The repository, including all environments, is available at https://github.com/zichunxx/panda_mujoco_gym.
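The two reward modes and the goal-conditioned observation layout can be sketched as follows. This is a minimal illustration that follows the convention commonly used by multi-goal robotics environments, not code taken from the repository. The 0.05 threshold, the function name `compute_reward`, and the example vectors are assumptions.

```python
import math


def compute_reward(achieved_goal, desired_goal, reward_type="sparse",
                   distance_threshold=0.05):
    """Reward from the distance between the achieved and desired goals.

    sparse: 0 if within distance_threshold (assumed 0.05 here), else -1.
    dense:  negative Euclidean distance to the goal.
    """
    distance = math.dist(achieved_goal, desired_goal)
    if reward_type == "sparse":
        return 0.0 if distance <= distance_threshold else -1.0
    return -distance


# A goal-conditioned observation is a dictionary with three keys.
# The numbers are placeholders, not values from the actual environments.
observation = {
    "observation": [0.0, 0.0, 0.2, 0.0, 0.0, 0.0],  # e.g. end-effector state
    "achieved_goal": (0.0, 0.0, 0.0),               # current object position
    "desired_goal": (0.3, 0.4, 0.0),                # target position
}

sparse = compute_reward(observation["achieved_goal"],
                        observation["desired_goal"], "sparse")
dense = compute_reward(observation["achieved_goal"],
                       observation["desired_goal"], "dense")
```

Because the reward depends only on the two goal entries, it can be recomputed for any substituted goal, which is what makes this observation layout compatible with hindsight relabeling.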

Paper Structure (9 sections, 13 equations, 5 figures, 2 tables)

INTRODUCTION
PRELIMINARIES
Deep Deterministic Policy Gradient
Soft Actor Critic
Truncated Quantile Critics
ENVIRONMENTS
EVALUATION
CONCLUSION AND FUTURE WORK
ACKNOWLEDGMENT

Figures (5)

Figure 1: Base simulation environment implemented with the Franka model in MuJoCo Menagerie.
Figure 2: Overview of the benchmark environment, including registration and the training process.
Figure 3: Three proposed environments with the Franka model in MuJoCo Menagerie, in which the red point indicates the target.
Figure 4: Median success rates and standard deviations indicated by shaded areas over three random seeds.
Figure 5: The execution processes of different tasks via the trained policies.
