Reinforcement Learning for Solving Robotic Reaching Tasks in the Neurorobotics Platform
Marton Szep, Leander Lauenburg, Kevin Farkas, Xiyan Su, Chuanlong Zang
TL;DR
The paper tackles robotic reaching with reinforcement learning on the Neurorobotics Platform, addressing safety and data efficiency by comparing model-free agents (DDPG, TD3, SAC) under curriculum learning. It demonstrates that curriculum-guided TD3 with dense rewards and HER, especially in a four-joint configuration, achieves high precision (≈2.4 cm) and high success rates (≈92% at a 5 cm threshold), whereas the six-joint configuration is harder to learn. The study also analyzes learning from image data, showing that top-down 2D localization aids learning, but that manual or CNN-based ground-truth extraction remains imperfect and autoencoder latent representations underperform. Overall, the work provides a comprehensive assessment of model-free RL for a reaching task in neurorobotics, highlights practical design choices (reward shaping, HER, action-space constraints, curriculum), and points to future improvements in vision-based control and representation learning.
Abstract
In recent years, reinforcement learning (RL) has shown great potential for solving tasks in well-defined environments such as games or robotics. This paper aims to solve a robotic reaching task in a simulation run on the Neurorobotics Platform (NRP). The target position is initialized randomly and the robot has 6 degrees of freedom. We compare the performance of various state-of-the-art model-free algorithms. At first, the agent is trained on ground-truth data from the simulation to reach the target position in a single continuous movement. Later, the complexity of the task is increased by using image data from the simulation environment as input. Experimental results show that both training efficiency and final results can be improved with an appropriate dynamic training schedule function for curriculum learning.
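As a rough illustration of how a dense reward, a distance-based success threshold, and a dynamic curriculum schedule can fit together in a reaching task, consider the minimal sketch below. It is not the schedule function used in the paper, which is not specified here. The names `dense_reward`, `is_success`, and `CurriculumSchedule`, the idea of widening a target sampling radius, and all numeric defaults other than the 5 cm threshold are assumptions made for illustration.

```python
import math
from collections import deque


def dense_reward(effector, target):
    """Dense reward: negative Euclidean distance (in metres) to the target."""
    return -math.dist(effector, target)


def is_success(effector, target, threshold=0.05):
    """An episode counts as a success if the end effector is within `threshold` metres."""
    return math.dist(effector, target) <= threshold


class CurriculumSchedule:
    """Hypothetical schedule: widen the target sampling radius once the
    success rate over the last `window` episodes reaches `promote_at`."""

    def __init__(self, start_radius=0.1, max_radius=0.5, step=0.05,
                 promote_at=0.8, window=20):
        self.radius = start_radius
        self.max_radius = max_radius
        self.step = step
        self.promote_at = promote_at
        self.results = deque(maxlen=window)

    def update(self, success):
        """Record one episode outcome and return the (possibly enlarged) radius."""
        self.results.append(bool(success))
        window_full = len(self.results) == self.results.maxlen
        if window_full and sum(self.results) / len(self.results) >= self.promote_at:
            self.radius = min(self.max_radius, self.radius + self.step)
            self.results.clear()  # re-evaluate from scratch at the new difficulty
        return self.radius
```

In a training loop, one would call `update` after every episode and sample the next target within the returned radius, so that the task only becomes harder once the agent is reliably solving the current level.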
