MoE-Loco: Mixture of Experts for Multitask Locomotion
Runhan Huang, Shaoting Zhu, Yilun Du, Hang Zhao
TL;DR
MoE-Loco introduces a mixture-of-experts policy that lets a single robot policy perform multitask locomotion across quadruped and biped gaits on diverse terrains. The approach uses a two-stage PPO framework in which a shared gating network routes to specialized experts in both the actor and the critic, with an estimator that enables proprioception-only deployment. Experiments in simulation and on real hardware reveal emergent expert specialization, improved performance on mixed terrains, and the ability to recombine skills into new locomotion patterns by adjusting the gating weights. This modular, interpretable framework offers scalable multitask locomotion with task migration potential and robust sim-to-real transfer.
Abstract
We present MoE-Loco, a Mixture of Experts (MoE) framework for multitask locomotion in legged robots. Our method enables a single policy to handle diverse terrains, including bars, pits, stairs, slopes, and baffles, while supporting both quadrupedal and bipedal gaits. Using MoE, we mitigate the gradient conflicts that typically arise in multitask reinforcement learning, improving both training efficiency and performance. Our experiments demonstrate that different experts naturally specialize in distinct locomotion behaviors, which can be leveraged for task migration and skill composition. We further validate our approach in both simulation and real-world deployment, showcasing its robustness and adaptability.
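To make the gating idea concrete, the following is a minimal sketch of a mixture-of-experts actor-critic in which one gating network produces the weights used to blend both the actor experts and the critic experts. It is not the authors' implementation: the class name `MoEActorCritic`, the single-linear-layer experts, the dense softmax gating, the dimensions, and the `gate_override` argument are all assumptions chosen for illustration, and PPO training, the estimator, and the two-stage schedule are omitted.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def init_linear(rng, in_dim, out_dim, scale=0.1):
    """Return (weights, bias) for a linear layer; weights has out_dim rows."""
    weights = [[rng.uniform(-scale, scale) for _ in range(in_dim)]
               for _ in range(out_dim)]
    return weights, [0.0] * out_dim


def linear(weights, bias, x):
    """Apply y = W x + b using plain Python lists."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


class MoEActorCritic:
    """Hypothetical MoE head: one gate shared by actor and critic experts."""

    def __init__(self, obs_dim, act_dim, num_experts, seed=0):
        rng = random.Random(seed)
        self.gate = init_linear(rng, obs_dim, num_experts)
        # Assumption: each expert is a single linear layer for brevity.
        self.actor_experts = [init_linear(rng, obs_dim, act_dim)
                              for _ in range(num_experts)]
        self.critic_experts = [init_linear(rng, obs_dim, 1)
                               for _ in range(num_experts)]

    def gating_weights(self, obs):
        """Softmax over gate logits, so the weights sum to one."""
        return softmax(linear(*self.gate, obs))

    @staticmethod
    def _mix(experts, gates, obs):
        """Weighted sum of expert outputs using the gating weights."""
        outputs = [linear(w, b, obs) for w, b in experts]
        return [sum(g * out[i] for g, out in zip(gates, outputs))
                for i in range(len(outputs[0]))]

    def forward(self, obs, gate_override=None):
        """Return (action, value, gates); gate_override replaces the gate."""
        gates = self.gating_weights(obs) if gate_override is None else gate_override
        action = self._mix(self.actor_experts, gates, obs)
        value = self._mix(self.critic_experts, gates, obs)[0]
        return action, value, gates


policy = MoEActorCritic(obs_dim=4, act_dim=2, num_experts=3)
obs = [0.1, -0.2, 0.3, 0.05]
action, value, gates = policy.forward(obs)

# Manually forcing all weight onto expert 0 mimics, in a toy way,
# adjusting gating weights to recombine or select skills.
forced_action, _, _ = policy.forward(obs, gate_override=[1.0, 0.0, 0.0])
```

Because both heads consume the same gating weights, an expert index refers to the same routing decision in the actor and in the critic. Passing `gate_override` bypasses the learned gate entirely, which is one simple way to probe what an individual expert has learned under these assumptions.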
