MoE-Loco: Mixture of Experts for Multitask Locomotion

Runhan Huang, Shaoting Zhu, Yilun Du, Hang Zhao

TL;DR

MoE-Loco introduces a mixture-of-experts (MoE) policy that lets a single policy perform multitask locomotion with quadrupedal and bipedal gaits on diverse terrains. The approach uses a two-stage PPO framework with a shared gating network that routes to specialized experts in both the actor and the critic, aided by an estimator that enables proprioception-only deployment. Experiments in simulation and on real hardware show emergent expert specialization, improved performance on mixed terrains, and the ability to recombine skills into new locomotion patterns by adjusting gating weights. This modular, interpretable framework offers scalable multitask locomotion with task migration potential and robust sim-to-real transfer.
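To make the routing idea concrete, here is a minimal sketch of one gating network shared by an actor and a critic, each built from several experts. It is not the authors' implementation: the class name `MoEActorCritic`, the single-layer linear experts, the dense softmax weighting, and all dimensions are assumptions chosen for illustration. The `gate_override` argument is a hypothetical hook showing how manually set gating weights could recombine expert outputs.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def make_linear(rng, out_dim, in_dim):
    """Random weight matrix (list of rows) and a zero bias vector."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(in_dim)] for _ in range(out_dim)]
    return weights, [0.0] * out_dim


def linear(layer, x):
    """Apply y = W x + b for a (weights, bias) pair."""
    weights, bias = layer
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def mix(gate, expert_outputs):
    """Gate-weighted sum of per-expert output vectors."""
    return [sum(g * out[j] for g, out in zip(gate, expert_outputs))
            for j in range(len(expert_outputs[0]))]


class MoEActorCritic:
    """Illustrative only: one gate shared by actor experts and critic experts."""

    def __init__(self, obs_dim, act_dim, num_experts, seed=0):
        rng = random.Random(seed)
        self.gate = make_linear(rng, num_experts, obs_dim)  # shared gating network
        self.actor_experts = [make_linear(rng, act_dim, obs_dim) for _ in range(num_experts)]
        self.critic_experts = [make_linear(rng, 1, obs_dim) for _ in range(num_experts)]

    def gate_weights(self, obs):
        return softmax(linear(self.gate, obs))

    def act(self, obs, gate_override=None):
        # gate_override (assumed hook) replaces the learned gating weights.
        gate = gate_override if gate_override is not None else self.gate_weights(obs)
        return mix(gate, [linear(e, obs) for e in self.actor_experts])

    def value(self, obs):
        gate = self.gate_weights(obs)
        return mix(gate, [linear(e, obs) for e in self.critic_experts])[0]


# Assumed sizes: 8 observation features, 12 joint targets, 4 experts.
policy = MoEActorCritic(obs_dim=8, act_dim=12, num_experts=4)
obs = [0.1 * i for i in range(8)]
gate = policy.gate_weights(obs)
action = policy.act(obs)
only_expert_0 = policy.act(obs, gate_override=[1.0, 0.0, 0.0, 0.0])
```

Passing a one-hot `gate_override` returns the output of a single expert, while intermediate weights blend several experts. That blending is the sense in which adjusting gating weights can compose skills in this sketch.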

Abstract

We present MoE-Loco, a Mixture of Experts (MoE) framework for multitask locomotion for legged robots. Our method enables a single policy to handle diverse terrains, including bars, pits, stairs, slopes, and baffles, while supporting quadrupedal and bipedal gaits. Using MoE, we mitigate the gradient conflicts that typically arise in multitask reinforcement learning, improving both training efficiency and performance. Our experiments demonstrate that different experts naturally specialize in distinct locomotion behaviors, which can be leveraged for task migration and skill composition. We further validate our approach in both simulation and real-world deployment, showcasing its robustness and adaptability.
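One common way to picture a gradient conflict, offered here as an assumption rather than the paper's own metric, is a negative cosine similarity between the gradients that two tasks produce for the same shared parameters. The gradient values below are made up for illustration.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def norm(a):
    return math.sqrt(dot(a, a))


def cosine_similarity(a, b):
    return dot(a, b) / (norm(a) * norm(b))


# Hypothetical gradients of two task losses w.r.t. the same shared parameters.
grad_task_a = [1.0, 0.5, -0.2]
grad_task_b = [-0.8, -0.4, 0.3]

similarity = cosine_similarity(grad_task_a, grad_task_b)
conflicting = similarity < 0.0  # the tasks pull the shared parameters apart

# Summing conflicting gradients cancels much of each task's update.
combined = [a + b for a, b in zip(grad_task_a, grad_task_b)]
combined_norm = norm(combined)
```

In this toy case the combined update is much shorter than either task's own gradient, so neither task makes much progress. Routing tasks to different experts gives them partly separate parameters, which is one intuition for why an MoE can reduce such interference.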

Paper Structure

This paper contains 25 sections, 5 equations, 9 figures, 9 tables, 2 algorithms.

Figures (9)

  • Figure 2: A snapshot of the terrain settings. From left to right: bar, pit, baffle, slope, and stairs.
  • Figure 3: Overview of our MoE-Loco pipeline. With its MoE architecture, our policy achieves robust multitask locomotion on various challenging terrains with multiple gaits.
  • Figure 4: Real-world success rate over multiple terrains and gaits.
  • Figure 5: Real-world experiments over multiple terrains and gaits: 1. Bar (Quad), 2. Pit (Quad), 3. Baffle (Quad), 4. Stair (Quad), 5. Slope (Quad), 6. Stand up (Bip), 7. Walk (Bip), 8. Slope (Bip), 9. Stair (Bip).
  • Figure 6: Training curve of our multitask policy in the pretraining stage.
  • ...and 4 more figures