Efficient Learning-Based Control of a Legged Robot in Lunar Gravity

Philip Arm, Oliver Fischer, Joseph Church, Adrian Fuhrer, Hendrik Kolvenbach, Marco Hutter

TL;DR

The paper tackles energy-efficient legged locomotion for planetary exploration by combining gravity-aware reward scaling with power-regularized reinforcement learning. It trains two PPO-based controllers, one for locomotion and one for base pose, on a Magnecko quadruped and validates them across gravity levels from lunar to super-Earth, using a passive gravity-offload system for real-world lunar-gravity tests. Key contributions are (i) a gravity-based reward scaling law with gravity factor $\alpha_g=\frac{g_E}{g}$, where $g_E$ is Earth's gravitational acceleration and $g$ that of the target environment; (ii) a separate power-regularization term that models drivetrain losses; (iii) a demonstration of cross-gravity policies in simulation and on hardware; and (iv) a practical offload setup for lunar-gravity experiments. The power-optimized policies reduce locomotion power by about 23% in Earth gravity and about 36% in lunar gravity relative to a baseline, offering a scalable route to gravity-robust, energy-efficient locomotion for planetary missions. The results also highlight the need to refine the scaling laws for high-gravity scenarios.
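
As a quick illustration of the gravity factor, the following Python sketch evaluates $\alpha_g = g_E/g$ for the lunar and super-Earth gravity levels quoted in the abstract. The value $g_E = 9.81$ m/s$^2$ and the helper name `gravity_factor` are assumptions made for this example, not taken from the paper, and the sketch does not attempt to show how the factor enters the individual reward terms.

```python
# Gravity factor alpha_g = g_E / g, as defined in the summary above.
# ASSUMPTION: g_E = 9.81 m/s^2 (consistent with the 19.62 m/s^2 super-Earth case).
G_EARTH = 9.81  # m/s^2


def gravity_factor(g: float, g_earth: float = G_EARTH) -> float:
    """Return alpha_g = g_E / g for a gravity level g given in m/s^2.

    Hypothetical helper for illustration; not the authors' code.
    """
    if g <= 0:
        raise ValueError("gravity must be positive")
    return g_earth / g


# Lunar and super-Earth values are the ones quoted in the abstract.
levels = {"Moon": 1.62, "Earth": G_EARTH, "super-Earth": 19.62}
factors = {name: gravity_factor(g) for name, g in levels.items()}

for name, alpha in factors.items():
    print(f"{name:12s} g = {levels[name]:5.2f} m/s^2   alpha_g = {alpha:.2f}")
```

Under these assumptions the factor is about 6.06 in lunar gravity, 1 on Earth, and 0.5 on the super-Earth, so it grows as gravity decreases.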

Abstract

Legged robots are promising candidates for exploring challenging areas on low-gravity bodies such as the Moon, Mars, or asteroids, thanks to their advanced mobility on unstructured terrain. However, as planetary robots' power and thermal budgets are highly restricted, these robots need energy-efficient control approaches that easily transfer to multiple gravity environments. In this work, we introduce a reinforcement learning-based control approach for legged robots with gravity-scaled power-optimized reward functions. We use our approach to develop and validate a locomotion controller and a base pose controller in gravity environments from lunar gravity (1.62 m/s$^2$) to a hypothetical super-Earth (19.62 m/s$^2$). Our approach successfully scales across these gravity levels for locomotion and base pose control with the gravity-scaled reward functions. The power-optimized locomotion controller reached a power consumption for locomotion of 23.4 W in Earth gravity on a 15.65 kg robot at 0.4 m/s, a 23 % improvement over the baseline policy. Additionally, we designed a constant-force spring offload system that allowed us to conduct real-world experiments on legged locomotion in lunar gravity. In lunar gravity, the power-optimized control policy reached 12.2 W, 36 % less than a baseline controller which is not optimized for power efficiency. Our method provides a scalable approach to developing power-efficient locomotion controllers for legged robots across multiple gravity levels.
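
The reported savings can be sanity-checked with simple arithmetic. The sketch below back-computes the baseline power implied by each figure in the abstract, assuming that "23 % improvement" and "36 % less" both mean a fractional reduction relative to the baseline. Because the percentages are rounded, the results are only approximate and are not values reported by the authors; the function name `implied_baseline` is hypothetical.

```python
def implied_baseline(optimized_w: float, reduction: float) -> float:
    """Baseline power (W) implied by an optimized power and a fractional reduction.

    ASSUMPTION: optimized = baseline * (1 - reduction).
    """
    if not 0.0 <= reduction < 1.0:
        raise ValueError("reduction must be in [0, 1)")
    return optimized_w / (1.0 - reduction)


# (optimized power in W, fractional reduction), both taken from the abstract.
reported = {"Earth": (23.4, 0.23), "Moon": (12.2, 0.36)}
baselines = {
    name: implied_baseline(power, reduction)
    for name, (power, reduction) in reported.items()
}

for name, base in baselines.items():
    power, reduction = reported[name]
    print(f"{name}: optimized {power:.1f} W, implied baseline ~{base:.1f} W")
```

With these inputs the implied baselines come out at roughly 30.4 W in Earth gravity and 19.1 W in lunar gravity.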

Paper Structure

This paper contains 14 sections, 13 equations, 4 figures, 3 tables.

Figures (4)

  • Figure 1: Our robot can efficiently walk in Earth gravity (top left) and in a lunar gravity test setup (bottom left) using gravity-scaled power-optimized control policies. Additionally, we created a base pose tracking controller that can operate in the same environments (top and bottom right).
  • Figure 2: Overview of our training setup: We use PPO (Schulman et al., 2017) with an asymmetric actor-critic setup. Scaling the reward function with gravity based on first principles allows our setup to scale across multiple gravity levels.
  • Figure 3: Power losses in the drivetrain. Recuperation loss occurs because not all the kinetic energy is effectively converted to electrical energy when the drivetrain is braking.
  • Figure 4: Our constant force spring offload system (left) allows us to test lunar locomotion policies on the real robot. We mounted the system on a wheeled gantry (right) to conduct locomotion tests and measure the robot's power consumption during locomotion.