GenLoco: Generalized Locomotion Controllers for Quadrupedal Robots

Gilbert Feng; Hongbo Zhang; Zhongyu Li; Xue Bin Peng; Bhuvan Basireddy; Linzhu Yue; Zhitao Song; Lizhi Yang; Yunhui Liu; Koushil Sreenath; Sergey Levine


TL;DR

GenLoco addresses the problem of building broadly applicable locomotion controllers for quadrupedal robots by training a single phase- and history-conditioned policy on a wide range of procedurally generated morphologies. The method randomizes morphology and dynamics in simulation and uses a simple feedforward policy that outputs target joint positions $\mathbf{q}^d \in \mathbb{R}^{12}$; these are added to time-invariant nominal poses and low-pass filtered before being passed to joint-level PD controllers. The policy is trained with PPO and runs at $30$ Hz. Key contributions include the GenLoco framework, a straightforward morphology randomization scheme, and extensive zero-shot real-world and out-of-distribution evaluations demonstrating transfer to unseen robots (e.g., A1, Mini Cheetah, Sirius) without robot-specific retraining. The work reduces the manual controller engineering needed for each new quadrupedal platform and suggests a path toward general-purpose locomotion controllers. Noted limitations include the fixed number of DoFs and the potential benefit of more expressive architectures for broader generalization.
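The action path described above (policy output, plus nominal pose, then a low-pass filter, then PD control) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the class name `ActionPipeline`, the first-order exponential filter, the filter coefficient `alpha`, the gains `kp` and `kd`, and the zero target velocity in the PD law are all assumptions chosen for clarity.

```python
from dataclasses import dataclass
from typing import List, Optional

NUM_JOINTS = 12  # four 3-DoF legs, matching q^d in R^12


@dataclass
class ActionPipeline:
    """Hypothetical sketch of the action path: offset, filter, PD."""
    nominal_pose: List[float]      # time-invariant nominal joint positions
    kp: float = 50.0               # assumed proportional gain
    kd: float = 1.0                # assumed derivative gain
    alpha: float = 0.5             # assumed low-pass coefficient in (0, 1]
    _filtered: Optional[List[float]] = None

    def target(self, action: List[float]) -> List[float]:
        # Policy outputs are offsets added to the nominal joint positions.
        raw = [n + a for n, a in zip(self.nominal_pose, action)]
        if self._filtered is None:
            # First step: initialize the filter state with the raw target.
            self._filtered = list(raw)
        else:
            # First-order exponential smoothing (an assumed filter form).
            self._filtered = [
                self.alpha * r + (1.0 - self.alpha) * f
                for r, f in zip(raw, self._filtered)
            ]
        return list(self._filtered)

    def torques(self, action: List[float], q: List[float],
                dq: List[float]) -> List[float]:
        # Joint-level PD law with zero desired joint velocity (assumed).
        q_des = self.target(action)
        return [
            self.kp * (qd - qi) - self.kd * dqi
            for qd, qi, dqi in zip(q_des, q, dq)
        ]
```

Filtering the summed target rather than the raw action follows the order given in the text; a different filter design would slot into `target` without changing the rest of the sketch.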

Abstract

Recent years have seen a surge in commercially-available and affordable quadrupedal robots, with many of these platforms being actively used in research and industry. As the availability of legged robots grows, so does the need for controllers that enable these robots to perform useful skills. However, most learning-based frameworks for controller development focus on training robot-specific controllers, a process that needs to be repeated for every new robot. In this work, we introduce a framework for training generalized locomotion (GenLoco) controllers for quadrupedal robots. Our framework synthesizes general-purpose locomotion controllers that can be deployed on a large variety of quadrupedal robots with similar morphologies. We present a simple but effective morphology randomization method that procedurally generates a diverse set of simulated robots for training. We show that by training a controller on this large set of simulated robots, our models acquire more general control strategies that can be directly transferred to novel simulated and real-world robots with diverse morphologies, which were not observed during training.
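To make the idea of procedurally generating training robots concrete, the sketch below samples one morphology from a template. It is only an illustration under stated assumptions: the `Morphology` fields, the sampling ranges, the per-parameter jitter, and the cubic mass scaling (which assumes constant density) are hypothetical and are not taken from the paper; only the grouping into a size factor, base parameters, leg parameters, and PD gains follows the paper's outline.

```python
import random
from dataclasses import dataclass


@dataclass
class Morphology:
    size_factor: float
    base_length: float   # metres
    base_mass: float     # kilograms
    thigh_length: float  # metres
    calf_length: float   # metres
    kp: float            # PD proportional gain
    kd: float            # PD derivative gain


# Hypothetical template values, used only to make the example runnable.
TEMPLATE = Morphology(1.0, 0.4, 10.0, 0.2, 0.2, 50.0, 1.0)

SIZE_RANGE = (0.8, 1.2)    # assumed range for the global size factor
JITTER_RANGE = (0.9, 1.1)  # assumed per-parameter perturbation


def sample_morphology(rng: random.Random,
                      template: Morphology = TEMPLATE) -> Morphology:
    """Sample one simulated robot by scaling and perturbing a template."""
    s = rng.uniform(*SIZE_RANGE)

    def jitter() -> float:
        return rng.uniform(*JITTER_RANGE)

    return Morphology(
        size_factor=s,
        # Lengths scale linearly with the size factor.
        base_length=template.base_length * s * jitter(),
        # Mass scales with volume under the constant-density assumption.
        base_mass=template.base_mass * s ** 3 * jitter(),
        thigh_length=template.thigh_length * s * jitter(),
        calf_length=template.calf_length * s * jitter(),
        # Gains are only jittered here; any size dependence is left out.
        kp=template.kp * jitter(),
        kd=template.kd * jitter(),
    )
```

Calling `sample_morphology(random.Random(seed))` once per training environment would yield a different simulated robot for each seed while keeping runs reproducible.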


Paper Structure (23 sections, 5 equations, 6 figures, 3 tables)


Introduction
Related Work
Generalized Quadrupedal Locomotion Controllers
Morphology Generation
Size factor.
Robot base parameters.
Leg parameters.
PD gains.
Training
State and action spaces.
Dynamics randomization.
Episode design.
Simulation Validation
Zero-Shot Transfer to Novel Robots
Out-of-Distribution Generalization
...and 8 more sections

Figures (6)

Figure 1: Testing of different simulated and real quadrupedal robots (A1, Mini Cheetah and Sirius) performing pacing gaits using a single locomotion controller. Our controllers can be directly deployed from simulation to the real world and across robots with different morphologies (e.g., body size, leg lengths, masses, etc.) and dynamics, without explicitly training on the specific robots used during testing.
Figure 2: The proposed generalized locomotion control framework for quadrupedal robots. GenLoco is designed to work on a large collection of robots with different morphology and dynamics. The input observations of the policy consist of a phase variable $\phi$ representing progression along a motion, a history of the robot's raw sensor feedback, and a history of past actions. The actions output by the controllers are added to time-invariant nominal joint positions and passed through a low-pass filter before being applied to joint-level PD controllers to generate motor torques.
Figure 3: Many common quadrupedal robots follow a common morphological template, consisting of a robot base (6 DoFs) and four 3-DoF legs. This design is followed in robots such as Unitree's A1, Go1, Aliengo, Laikago, MIT's Mini Cheetah, CUHK's Sirius, and Boston Dynamics' Spot. The robots highlighted with solid circles are the ones used to validate our system in the real world.
Figure 4: GenLoco policies deployed on a collection of quadrupedal robots that exist in real life. Two separate models are trained to perform a pacing gait and a spinning gait respectively. The GenLoco policies are trained in simulation using only procedurally generated robots, and robots used for testing are not used in the training process. The learned controllers can be directly deployed on all of these robots, including the ANYmal-series robots which have a distinct knee joint design, to perform agile maneuvers without additional training.
Figure 5: Performance benchmark of GenLoco policies and policies trained specifically for the A1 robot on pacing and spinning skills, across a range of robot morphologies and dynamics parameters in simulation. Green lines show the normalized return of the GenLoco policy (ours); orange lines show that of the A1-specific policy. Returns are calculated by normalizing the cumulative reward (defined in the reward appendix) over the episode length. The red dashed lines indicate the training range, while the black dashed lines denote the A1's morphological parameters. Note that in the varying-morphology test the A1-specific policy is not trained with randomized morphology, and the dynamics randomization range is the same for all policies. Overall, the GenLoco policies outperform the A1-specific policies across morphology and dynamics parameters, and generalize over a larger range of both. Each testing episode lasts $100$ timesteps, and returns are averaged across $10$ trials.
...and 1 more figures
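The normalized-return metric in the Figure 5 caption (cumulative reward divided by episode length, averaged over trials) can be sketched as below. The function names are hypothetical, and dividing by the nominal horizon of $100$ timesteps even when an episode ends early is an assumption about how the normalization is done, not something the caption states.

```python
from typing import List, Sequence

EPISODE_LENGTH = 100  # timesteps per testing episode, from the caption
NUM_TRIALS = 10       # trials averaged per data point, from the caption


def normalized_return(rewards: Sequence[float],
                      episode_length: int = EPISODE_LENGTH) -> float:
    """Cumulative reward divided by the nominal episode length.

    Assumption: an episode that terminates early (fewer rewards than
    episode_length) is still divided by the nominal length, so early
    falls lower the score.
    """
    return sum(rewards) / episode_length


def mean_normalized_return(trials: List[Sequence[float]],
                           episode_length: int = EPISODE_LENGTH) -> float:
    """Average of per-trial normalized returns."""
    return sum(normalized_return(t, episode_length) for t in trials) / len(trials)
```

For example, ten full-length trials with a constant per-step reward of 0.5 give a mean normalized return of 0.5, while a trial that stops halfway contributes 0.25 under the assumption above.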
