Learning to enhance multi-legged robot on rugged landscapes

Juntao He; Baxi Chong; Zhaochen Xu; Sehoon Ha; Daniel I. Goldman

TL;DR

The paper tackles robust locomotion of multi-legged robots on rugged landscapes by combining a physics-based MuJoCo simulator with a learning-based controller that coordinates three gait amplitudes: leg stepping ($\Theta_{leg}$), horizontal body undulation ($\Theta_{body}$), and vertical body undulation ($A_v$). Trained with PPO and domain randomization, the policy adjusts the three amplitudes in real time from the ground-foot contact state ($\beta$), yielding substantial speed gains over a linear controller that modulates only $A_v$. The approach is validated in simulation, lab-based experiments, and outdoor field tests, with forward-speed improvements of up to about 50% and modest yaw disruption. This work advances terrain-adaptive locomotion for multi-legged systems with high static stability and provides a scalable framework for sim-to-real transfer in complex environments.

Abstract

Navigating rugged landscapes poses significant challenges for legged locomotion. Multi-legged robots (those with six or more legs) offer a promising solution for such terrains, largely due to their inherently high static stability, which results from a low center of mass and a wide base of support. Such systems require minimal effort to maintain balance. Recent studies have shown that a linear controller, which modulates the vertical body undulation of a multi-legged robot in response to changes in terrain roughness, can ensure reliable mobility on challenging terrains. However, the potential of a learning-based control framework that adjusts multiple parameters to address terrain heterogeneity remains underexplored. We posit that an experimentally validated, physics-based simulator for this robot can rapidly advance capabilities by allowing exploration of a wide parameter space. Here we develop a MuJoCo-based simulator tailored to this robotic platform and use it to develop a reinforcement learning-based control framework that dynamically adjusts horizontal body undulation, vertical body undulation, and limb stepping in real time. Our approach improves robot performance in simulation, laboratory experiments, and outdoor tests. Notably, our real-world experiments reveal that the learning-based controller achieves a 30% to 50% increase in speed compared to a linear controller that modulates only vertical body waves. We hypothesize that the superior performance of the learning-based controller arises from its ability to adjust multiple parameters simultaneously, including limb stepping, horizontal body wave, and vertical body wave.
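
As a rough illustration of the linear baseline mentioned above, the sketch below applies a proportional correction to the vertical amplitude $A_v$ from the contact ratio $\beta$, using the proportional gain $K_p$ named in the Figure 4 caption. The incremental update form, the sign convention (less contact than expected raises $A_v$), the clamping range, and all numeric values are assumptions made for illustration and are not taken from the paper.

```python
def update_vertical_amplitude(a_v, beta_actual, beta_expected, k_p,
                              a_min=0.0, a_max=30.0):
    """Proportional update of the vertical body-wave amplitude (degrees).

    Assumed form: A_v <- A_v + K_p * (beta_expected - beta_actual).
    The sign convention and the clamp limits are hypothetical.
    """
    delta_beta = beta_expected - beta_actual  # contact-ratio discrepancy
    a_new = a_v + k_p * delta_beta            # proportional correction
    return min(max(a_new, a_min), a_max)      # keep amplitude within limits


# Hypothetical numbers: fewer feet in contact than expected raises A_v.
new_a_v = update_vertical_amplitude(10.0, beta_actual=0.5,
                                    beta_expected=0.75, k_p=20.0)
```

The learning-based controller described in the abstract replaces this single-parameter rule with a policy that outputs all three amplitudes; that policy is not sketched here because its network and reward details are not given in this excerpt.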

Paper Structure (17 sections, 6 equations, 7 figures)

Introduction
Related work
Wave patterns in multi-legged robot
Feedback control for multi-legged robot
Reinforcement learning for legged locomotion
Method
Multi-legged robot simulator
Learning-based control policy
Background: Markov Decision Process
Learning formulation
Domain Randomization
Implementation Details
Results
Simulation results
Lab-based test
...and 2 more sections

Figures (7)

Figure 1: Outdoor experiments demonstrate a significant improvement in a multi-legged robot's speed by implementing a learning-based controller. On terrain composed of a mixture of bush, fern, and pine straw, the learning-based controller achieves a 50% increase in speed compared to the linear controller.
Figure 2: Simulation validation on flat ground. A. Marker assembly locations on the robot. B. Snapshots from both the real-world experiment and the simulation. C. Displacement over time for three markers, comparing the simulation with the real-world experiments.
Figure 3: Simulation validation on rough terrain. A. Terrain with different roughness. B. Screenshot depicting the robot moving forward with varying vertical amplitudes ($A_v$) on the $R_g=0.17$ terrain. C. Displacement versus time for robot moving on $R_g=0.17$ terrain with $A_v=20^{\circ}$. D. Velocity versus vertical amplitude plot comparing simulation and real-world experiments.
Figure 4: Control frameworks for the linear and learning-based controllers. A. Linear controller: This controller modulates the vertical body undulation wave based on real-time ground-foot contact data from the sensors (contact ratio $\beta$). Here $A_v$ represents the amplitude of the vertical body wave, and $\theta_v$ denotes the joint angle of the vertical body joint. The parameter $K_p$ refers to the proportional gain of the linear controller, and $\Delta \beta$ indicates the discrepancy between the actual and expected contact ratios. B. Learning-based controller: This controller adjusts limb stepping and both horizontal and vertical body undulation waves, based on the amplitudes of these three waves alongside real-time ground-foot contact ratio ($\beta$). In this case, $\Theta_{leg}$ and $\Theta_{body}$ correspond to the amplitudes of the leg and body waves, respectively, while $\theta_{leg}$ and $\theta_{body}$ represent the leg joint angle and the horizontal body angle, respectively.
Figure 5: Rough terrain variation in simulation and simulation results. A. The roughness of the terrain in simulation is varied by adjusting the parameter $\sigma$, which modifies the standard deviation of block heights, effectively randomizing the terrain conditions. B. The simulation results present a comparison between the performance of the learning-based controller and the linear controller. The average speed per cycle, denoted as $\bar{v}$, is used as the performance metric.
...and 2 more figures
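
The terrain variation described in the Figure 5A caption can be sketched as a grid of block heights whose spread is set by $\sigma$. This is a minimal sketch under stated assumptions: the Gaussian distribution, the grid size, the zero-floor shift, and the function name `make_block_heights` are all hypothetical, since the caption only states that $\sigma$ controls the standard deviation of block heights.

```python
import random


def make_block_heights(n_rows, n_cols, sigma, seed=0):
    """Return an n_rows x n_cols grid of block heights.

    Heights are drawn from a zero-mean Gaussian with standard deviation
    sigma (an assumed distribution), then shifted so the lowest block
    sits at height zero.
    """
    rng = random.Random(seed)  # seeded for reproducible terrains
    heights = [[rng.gauss(0.0, sigma) for _ in range(n_cols)]
               for _ in range(n_rows)]
    lowest = min(min(row) for row in heights)
    return [[h - lowest for h in row] for row in heights]


# Hypothetical usage: a flat terrain and a rougher one of the same size.
flat = make_block_heights(4, 6, sigma=0.0)
rough = make_block_heights(4, 6, sigma=0.05, seed=1)
```

Sampling a new seed and a new $\sigma$ per training episode would be one simple way to randomize terrain conditions; whether the paper does exactly this is not stated in the excerpt.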
