Heuristic Step Planning for Learning Dynamic Bipedal Locomotion: A Comparative Study of Model-Based and Model-Free Approaches

William Suliman, Ekaterina Chaikovskaia, Egor Davydenko, Roman Gorbachev

TL;DR

The paper tackles robust, energy-efficient bipedal locomotion in unstructured environments by integrating a simple heuristic, Linear Step Planning (LS), with a Raibert-like velocity regulator into a reinforcement learning framework. It compares LS against a model-based LIPM baseline, showing that LS achieves comparable or superior velocity tracking, better energy efficiency, and greater robustness on uneven terrain and under disturbances. The results suggest that complex analytical models may be unnecessary for effective, generalizable learning-based bipedal control, at least in simulated environments. This approach enhances modularity and computational efficiency, enabling reliable interaction with challenging terrain without heavy reliance on full dynamics modeling.

Abstract

This work presents an extended framework for learning-based bipedal locomotion that incorporates a heuristic step-planning strategy guided by desired torso velocity tracking. The framework enables precise interaction between a humanoid robot and its environment, supporting tasks such as crossing gaps and accurately approaching target objects. Unlike approaches based on full or simplified dynamics, the proposed method avoids complex step planners and analytical models. Step planning is primarily driven by heuristic commands, while a Raibert-type controller modulates the foot placement length based on the error between desired and actual torso velocity. We compare our method with a model-based step-planning approach -- the Linear Inverted Pendulum Model (LIPM) controller. Experimental results demonstrate that our approach attains comparable or superior accuracy in maintaining target velocity (up to 80%), significantly greater robustness on uneven terrain (over 50% improvement), and improved energy efficiency. These results suggest that incorporating complex analytical, model-based components into the training architecture may be unnecessary for achieving stable and robust bipedal walking, even in unstructured environments.
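
As a rough illustration of the idea of a heuristic step length corrected by a Raibert-type velocity term, the following minimal Python sketch may help. It is not the authors' implementation: the nominal step rule (`v_des * step_duration`), the sign convention of the correction (taken from the classic Raibert foot-placement heuristic), the gain, and the clamping limit are all assumptions chosen for illustration, and the names `StepPlannerConfig` and `plan_step_length` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class StepPlannerConfig:
    """Hypothetical parameters; only the 0.25 s step duration appears in the source."""
    step_duration: float = 0.25    # s, the step duration quoted in the figure captions
    raibert_gain: float = 0.1      # s, assumed feedback gain on the velocity error
    max_step_length: float = 0.3   # m, assumed kinematic limit on step length


def plan_step_length(v_des: float, v_actual: float, cfg: StepPlannerConfig) -> float:
    """Return a commanded step length in metres.

    Assumed heuristic: the nominal step covers the distance the torso should
    travel during one step, and a Raibert-type term lengthens the step when
    the torso moves faster than desired (and shortens it when slower).
    """
    nominal = v_des * cfg.step_duration
    correction = cfg.raibert_gain * (v_actual - v_des)
    length = nominal + correction
    # Clamp to the assumed reachable range.
    return max(-cfg.max_step_length, min(cfg.max_step_length, length))


if __name__ == "__main__":
    cfg = StepPlannerConfig()
    # Torso slightly faster than the 0.6 m/s command: the step is lengthened.
    print(plan_step_length(v_des=0.6, v_actual=0.7, cfg=cfg))
```

In a learning-based setup of the kind described above, a value like this would presumably serve as a step position command for the policy rather than as a direct joint-level target; the sketch only covers the planning rule itself.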

Paper Structure

This paper contains 16 sections, 3 equations, 8 figures, 5 tables.

Figures (8)

  • Figure 1: An overview of the structure and flow of the proposed learning-based framework. The proposed method employs a heuristic step planner to determine step locations, incorporating Raibert-like regulators to accurately track the desired velocity.
  • Figure 2: BRUCE and its general morphology.
  • Figure 3: Time series motion tiles for simulated walking using the LS-based approach in the PyBullet environment at $v_x = 0.6 \, \text{m/s}$ and a step duration of 0.25 s. Red area: right step position command; blue area: left step position command.
  • Figure 4: Velocity tracking performance on flat terrain for the proposed method at a step duration of 0.25 s. The LS-based method outperforms the LIPM-based approach.
  • Figure 5: Robot’s response to a push at zero commanded velocity. The LS-based method recovered from more pushes of a given force than the LIPM-based method, demonstrating better stability under disturbances. A total of 500 points were tested. The numbers of recovered and fallen points are indicated below each subplot.
  • ...and 3 more figures