Efficient Reinforcement Learning for Jumping Monopods
Riccardo Bussola, Michele Focchi, Andrea Del Prete, Daniele Fontanelli, Luigi Palopoli
TL;DR
This work tackles omni-directional jumping for a monopod on uneven terrain by injecting physical knowledge into reinforcement learning. The action space is restricted to a 5D, Cartesian-parametrised thrust plan expressed as a third-order Bézier curve. The plan parameters are learned with the TD3 algorithm, the resulting Cartesian trajectory is mapped to joint space through inverse kinematics, and a gravity-compensated low-level controller executes the motion. Learning is guided by a physics-informed reward that penalises constraint violations and rewards landing accuracy, which also lets the policy compensate for tracking errors of the low-level controller. Compared with nonlinear trajectory optimisation and end-to-end RL, the proposed approach yields larger feasible regions, faster training, and comparable or superior front-jump accuracy, while drastically reducing online computation and generalising to unseen targets.
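To make the thrust-plan idea concrete, the following is a minimal sketch of how a third-order (cubic) Bézier curve can be evaluated to obtain a Cartesian position and velocity reference over the thrust phase. It is illustrative only: the function names, the four control points, and the thrust duration are assumptions chosen for the example, and the paper's actual 5D parametrisation of the curve is not reproduced here.

```python
from typing import Sequence, Tuple

Vec3 = Tuple[float, float, float]


def bezier_position(ctrl: Sequence[Vec3], s: float) -> Vec3:
    """Position on a cubic Bezier curve at normalised time s in [0, 1]."""
    if len(ctrl) != 4:
        raise ValueError("a cubic Bezier curve needs exactly 4 control points")
    u = 1.0 - s
    # Cubic Bernstein basis polynomials.
    w = (u ** 3, 3.0 * u * u * s, 3.0 * u * s * s, s ** 3)
    return tuple(sum(w[i] * ctrl[i][k] for i in range(4)) for k in range(3))


def bezier_velocity(ctrl: Sequence[Vec3], s: float, duration: float) -> Vec3:
    """Time derivative of the curve, assuming s = t / duration."""
    if len(ctrl) != 4:
        raise ValueError("a cubic Bezier curve needs exactly 4 control points")
    u = 1.0 - s
    # The derivative is a quadratic Bezier curve over control-point differences.
    w = (3.0 * u * u, 6.0 * u * s, 3.0 * s * s)
    return tuple(
        sum(w[i] * (ctrl[i + 1][k] - ctrl[i][k]) for i in range(3)) / duration
        for k in range(3)
    )


# Hypothetical control points (metres) and thrust duration (seconds).
ctrl = [(0.0, 0.0, 0.25), (0.0, 0.0, 0.20), (0.1, 0.0, 0.30), (0.2, 0.0, 0.40)]
thrust_duration = 0.5

liftoff_position = bezier_position(ctrl, 1.0)
liftoff_velocity = bezier_velocity(ctrl, 1.0, thrust_duration)
```

The curve starts at the first control point and ends at the last, so in a setup like this the final control point and the final derivative fix the lift-off position and velocity, which in turn determine the ballistic flight towards the target.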
Abstract
In this work, we consider the complex control problem of making a monopod reach a target with a jump. The monopod can jump in any direction, and the terrain underneath its foot can be uneven. This is a template for a much larger class of problems, which are extremely challenging and computationally expensive to solve using standard optimisation-based techniques. Reinforcement Learning (RL) could be an interesting alternative, but an end-to-end approach, in which the controller must learn everything from scratch, is impractical. The solution advocated in this paper is to guide the learning process within an RL framework by injecting physical knowledge. This expedient brings widespread benefits, such as a drastic reduction in learning time and the ability to learn and compensate for possible errors in the low-level controller executing the motion. We demonstrate the advantage of our approach over both optimisation-based and end-to-end RL approaches.
