Integrating Model-Based Footstep Planning with Model-Free Reinforcement Learning for Dynamic Legged Locomotion
Ho Jae Lee, Seungwoo Hong, Sangbae Kim
TL;DR
The paper tackles robust dynamic legged locomotion by combining a physics-based footstep planner, grounded in the 3D Linear Inverted Pendulum Model (3D-LIPM), with a model-free PPO policy. The planner converts velocity commands into target foot placements via instantaneous capture point (ICP) trajectories. The RL policy learns to track these placements while maintaining balance, which leaves it free to explore behaviors beyond the simplified model. On the MIT Humanoid, the approach achieves stable forward walking at up to 1.5 m/s and performs dynamic turning, with demonstrated generalization to unseen rough and gap terrains and successful sim-to-real transfer. Because the physics model supplies guidance without constraining the policy to a template motion, the method improves velocity tracking and adaptability, indicating practical potential for real-world legged locomotion.
Abstract
In this work, we introduce a control framework that combines model-based footstep planning with Reinforcement Learning (RL), leveraging desired footstep patterns derived from the Linear Inverted Pendulum (LIP) dynamics. Utilizing the LIP model, our method forward predicts robot states and determines the desired foot placement given the velocity commands. We then train an RL policy to track the foot placements without following the full reference motions derived from the LIP model. This partial guidance from the physics model allows the RL policy to integrate the predictive capabilities of the physics-informed dynamics and the adaptability characteristics of the RL controller without overfitting the policy to the template model. Our approach is validated on the MIT Humanoid, demonstrating that our policy can achieve stable yet dynamic locomotion for walking and turning. We further validate the adaptability and generalizability of our policy by extending the locomotion task to unseen, uneven terrain. During the hardware deployment, we have achieved forward walking speeds of up to 1.5 m/s on a treadmill and have successfully performed dynamic locomotion maneuvers such as 90-degree and 180-degree turns.
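As a rough illustration of the forward prediction and foot-placement step described above, the following sketch uses the closed-form solution of the 1D LIP dynamics and a common steady-state capture-point stepping rule. It is not the authors' planner. The restriction to the sagittal axis, the steady-state offset rule, the function names `lip_predict` and `next_foot_offset`, and the example CoM height and step time are all assumptions made for illustration.

```python
import math

G = 9.81  # gravitational acceleration, m/s^2


def lip_predict(x0, v0, t, z0):
    """Closed-form LIP solution of x'' = (g / z0) * x.

    x0, v0: CoM position (relative to the stance foot) and velocity now.
    t: prediction horizon in seconds. z0: constant CoM height in metres.
    """
    w = math.sqrt(G / z0)  # natural frequency of the pendulum
    x = x0 * math.cosh(w * t) + (v0 / w) * math.sinh(w * t)
    v = x0 * w * math.sinh(w * t) + v0 * math.cosh(w * t)
    return x, v


def next_foot_offset(x0, v0, t_remaining, step_time, v_cmd, z0):
    """Next foot position relative to the current stance foot (1D).

    Assumed rule (not taken from the paper): place the foot so that the
    capture point at touchdown sits at the offset b that would sustain a
    periodic gait with step length v_cmd * step_time.
    """
    w = math.sqrt(G / z0)
    x_end, v_end = lip_predict(x0, v0, t_remaining, z0)
    icp_end = x_end + v_end / w          # capture point at end of step
    step_len = v_cmd * step_time         # desired steady-state step length
    b = step_len / (math.exp(w * step_time) - 1.0)
    return icp_end - b


if __name__ == "__main__":
    # Hypothetical values: 0.6 m CoM height, 0.35 s steps, 1.0 m/s command.
    offset = next_foot_offset(x0=0.0, v0=0.8, t_remaining=0.2,
                              step_time=0.35, v_cmd=1.0, z0=0.6)
    print(f"next foot placement: {offset:.3f} m ahead of the stance foot")
```

In the framework described in the abstract, a target of this kind would be given to the RL policy as a goal to track, rather than a full LIP reference trajectory.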
