Efficient Learning-Based Control of a Legged Robot in Lunar Gravity
Philip Arm, Oliver Fischer, Joseph Church, Adrian Fuhrer, Hendrik Kolvenbach, Marco Hutter
TL;DR
The paper tackles energy-efficient legged locomotion for planetary exploration by introducing gravity-aware reward scaling and power-regularized reinforcement learning. It trains two PPO-based controllers, one for locomotion and one for base pose, on a Magnecko quadruped and validates them across gravity levels from lunar to super-Earth, using a passive gravity-offload system for real-world lunar-gravity testing. Key contributions are a gravity-based reward scaling law with gravity factor $\alpha_g=\frac{g_E}{g}$ (Earth gravity $g_E$ divided by the target gravity $g$), a separate power-regularization term that models drivetrain losses, a demonstration of cross-gravity policies in simulation and on hardware, and a practical offload setup for lunar-gravity experiments. The power-optimized policies reduce power consumption by about 23% in Earth gravity and about 36% in lunar gravity relative to baseline policies. The approach scales to gravity-robust, energy-efficient locomotion for legged robots in planetary missions, although the scaling laws still need refinement for high-gravity scenarios.
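To make the gravity factor concrete, the following minimal sketch evaluates $\alpha_g=\frac{g_E}{g}$ at the gravity levels named in this summary. The value `G_EARTH = 9.81` and the names `gravity_factor`, `levels`, and `factors` are illustrative assumptions, not taken from the paper. The sketch only computes the factor and does not show how it is applied to individual reward terms.

```python
# Assumed value for g_E (standard Earth gravity), in m/s^2.
G_EARTH = 9.81


def gravity_factor(g: float) -> float:
    """Return alpha_g = g_E / g for a target gravity g in m/s^2."""
    if g <= 0:
        raise ValueError("gravity must be positive")
    return G_EARTH / g


# Gravity levels mentioned in the summary, in m/s^2.
levels = {"moon": 1.62, "earth": 9.81, "super_earth": 19.62}

# alpha_g > 1 for gravity weaker than Earth's, alpha_g < 1 for stronger gravity.
factors = {name: gravity_factor(g) for name, g in levels.items()}

for name, alpha in factors.items():
    print(f"{name}: alpha_g = {alpha:.3f}")
```

Under these assumptions the factor is roughly 6.06 for the Moon, exactly 1 for Earth, and about 0.5 for the super-Earth case.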
Abstract
Legged robots are promising candidates for exploring challenging areas on low-gravity bodies such as the Moon, Mars, or asteroids, thanks to their advanced mobility on unstructured terrain. However, as planetary robots' power and thermal budgets are highly restricted, these robots need energy-efficient control approaches that easily transfer to multiple gravity environments. In this work, we introduce a reinforcement learning-based control approach for legged robots with gravity-scaled power-optimized reward functions. We use our approach to develop and validate a locomotion controller and a base pose controller in gravity environments from lunar gravity (1.62 m/s$^2$) to a hypothetical super-Earth (19.62 m/s$^2$). Our approach successfully scales across these gravity levels for locomotion and base pose control with the gravity-scaled reward functions. The power-optimized locomotion controller reached a power consumption for locomotion of 23.4 W in Earth gravity on a 15.65 kg robot at 0.4 m/s, a 23 % improvement over the baseline policy. Additionally, we designed a constant-force spring offload system that allowed us to conduct real-world experiments on legged locomotion in lunar gravity. In lunar gravity, the power-optimized control policy reached 12.2 W, 36 % less than a baseline controller which is not optimized for power efficiency. Our method provides a scalable approach to developing power-efficient locomotion controllers for legged robots across multiple gravity levels.
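The reported figures can be put in context with a short back-of-the-envelope calculation. The sketch below derives the baseline power implied by the rounded percentages and a dimensionless cost of transport, $P/(m\,g\,v)$, for the Earth-gravity case. Neither quantity is stated in the abstract. The implied baselines are only approximate because they are back-computed from rounded savings, and all names, as well as $g_E = 9.81$ m/s$^2$, are illustrative assumptions.

```python
# Figures quoted in the abstract.
MASS_KG = 15.65          # robot mass
SPEED_MPS = 0.4          # walking speed for the Earth-gravity result
P_EARTH_W = 23.4         # power-optimized policy, Earth gravity
P_LUNAR_W = 12.2         # power-optimized policy, lunar gravity
SAVING_EARTH = 0.23      # reported improvement over baseline (rounded)
SAVING_LUNAR = 0.36      # reported improvement over baseline (rounded)

# Assumed value for Earth gravity in m/s^2 (not given in the abstract).
G_EARTH = 9.81


def implied_baseline(power_w: float, saving: float) -> float:
    """Back out the baseline power from an optimized power and a fractional saving."""
    return power_w / (1.0 - saving)


def cost_of_transport(power_w: float, mass_kg: float, g: float, speed: float) -> float:
    """Dimensionless cost of transport: P / (m * g * v)."""
    return power_w / (mass_kg * g * speed)


baseline_earth_w = implied_baseline(P_EARTH_W, SAVING_EARTH)
baseline_lunar_w = implied_baseline(P_LUNAR_W, SAVING_LUNAR)
cot_earth = cost_of_transport(P_EARTH_W, MASS_KG, G_EARTH, SPEED_MPS)

print(f"implied Earth baseline: {baseline_earth_w:.1f} W")
print(f"implied lunar baseline: {baseline_lunar_w:.1f} W")
print(f"Earth cost of transport: {cot_earth:.3f}")
```

The lunar cost of transport is left out because the abstract does not state the walking speed used in the lunar-gravity experiment.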
