Autonomous Control of Redundant Hydraulic Manipulator Using Reinforcement Learning with Action Feedback
Rohit Dhakate, Christian Brommer, Christoph Böhm, Stephan Weiss, Jan Steinbrener
TL;DR
This work demonstrates a fully data-driven approach to autonomous control of a redundant hydraulic manipulator. It enables direct Sim-2-Real deployment by emulating the actuator dynamics in simulation and learning an end-effector-to-actuator policy with reinforcement learning. The method combines two supervised networks (an actuator network that emulates the hydraulic dynamics and a forward-kinematics network) with a DDPG-based RL controller, and uses action feedback from the forward model to guide exploration. The approach achieves 3D trajectory tracking in simulation and transfers to a real forestry crane with reasonable accuracy despite unmodeled sway and backlash, which supports the practicality of data-driven, model-free control for hydraulic heavy machinery. The results suggest that complex, non-linearly actuated systems can be automated with minimal model information; sway compensation and curriculum-based learning are identified as future improvements.
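The action-feedback idea can be illustrated with a minimal sketch. It assumes that feedback means scoring candidate actions with a forward model and keeping the one whose predicted end-effector position is closest to the target; this summary does not specify the authors' exact mechanism. The function `forward_model` below is a hypothetical toy kinematic stand-in for the learned forward network, and the link lengths, joint ordering, and `select_with_feedback` are illustrative assumptions.

```python
import numpy as np

# Hypothetical link lengths (metres) for a toy crane; not taken from the paper.
L1, L2 = 1.0, 1.0


def forward_model(joints):
    """Toy stand-in for a learned forward network: joints -> EE position (x, y, z).

    Assumed joint order: slew angle, boom angle, elbow angle, telescope extension.
    """
    q0, q1, q2, q3 = joints
    reach = L1 * np.cos(q1) + (L2 + q3) * np.cos(q1 + q2)   # horizontal reach
    height = L1 * np.sin(q1) + (L2 + q3) * np.sin(q1 + q2)  # vertical offset
    return np.array([reach * np.cos(q0), reach * np.sin(q0), height])


def select_with_feedback(candidates, target, model=forward_model):
    """Return the candidate action with the smallest predicted EE error, and that error."""
    errors = [np.linalg.norm(model(c) - target) for c in candidates]
    best = int(np.argmin(errors))
    return candidates[best], errors[best]
```

In an RL loop, the candidates could be the raw policy output and one or more noise-perturbed versions of it; that choice is also an assumption of this sketch.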
Abstract
This article presents an entirely data-driven approach for autonomous control of redundant manipulators with hydraulic actuation. The approach requires only minimal system information, which is inherited from a simulation model. The non-linear hydraulic actuation dynamics are modeled with actuator networks trained on data gathered during manual operation of the manipulator, so that the real system is effectively emulated in a simulation environment. A neural network control policy for autonomous control, based on end-effector (EE) position tracking, is then learned using Reinforcement Learning (RL) with Ornstein-Uhlenbeck process noise (OUNoise) for efficient exploration. The RL agent also receives feedback based on supervised learning of the forward kinematics, which facilitates selecting the most suitable action during exploration. Given a target EE position, the control policy directly outputs the joint variables while taking the system dynamics into account. The joint variables are then mapped to hydraulic valve commands, which are fed to the system without further modification. The proposed approach is implemented on a scaled hydraulic forwarder crane with three revolute joints and one prismatic joint to track the desired EE position in three-dimensional (3D) space. With the emulated dynamics and extensive learning in simulation, the results demonstrate the feasibility of deploying the learned controller directly on the real system.
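For readers unfamiliar with the exploration noise named above, the following is a minimal sketch of a discretised Ornstein-Uhlenbeck process, x ← x + θ(μ − x)Δt + σ√Δt·N(0, I), added to a policy action. The parameter values (`theta`, `sigma`, `dt`), the action bounds, and the helper `explore` are illustrative assumptions, not values reported by the authors.

```python
import numpy as np


class OUNoise:
    """Temporally correlated noise from a discretised Ornstein-Uhlenbeck process."""

    def __init__(self, size, mu=0.0, theta=0.15, sigma=0.2, dt=1e-2, seed=None):
        self.mu = np.full(size, mu, dtype=float)  # long-run mean of the process
        self.theta = theta                        # mean-reversion rate
        self.sigma = sigma                        # noise scale
        self.dt = dt                              # time step
        self.rng = np.random.default_rng(seed)
        self.reset()

    def reset(self):
        """Restart the process at its mean, e.g. at the start of an episode."""
        self.state = self.mu.copy()

    def sample(self):
        """Advance the process by one step and return the new noise vector."""
        drift = self.theta * (self.mu - self.state) * self.dt
        diffusion = self.sigma * np.sqrt(self.dt) * self.rng.standard_normal(self.state.shape)
        self.state = self.state + drift + diffusion
        return self.state.copy()


def explore(action, noise, low=-1.0, high=1.0):
    """Perturb a policy action with OU noise and clip to assumed action bounds."""
    return np.clip(action + noise.sample(), low, high)
```

For the four-joint crane described above, `OUNoise(4)` would produce one noise component per joint variable. Because successive samples are correlated, the perturbation drifts smoothly instead of jittering, which is the usual motivation for pairing this noise with inertial actuators.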
