Enhancing Model-Based Step Adaptation for Push Recovery through Reinforcement Learning of Step Timing and Region

Tobias Egle, Yashuai Yan, Dongheui Lee, Christian Ott

TL;DR

This paper introduces an approach that makes humanoid walking more robust to strong perturbations such as substantial pushes. It uses reinforcement learning to dynamically adjust the permissible footstep region, expanding it to a larger, effectively non-convex area. This allows cross-over stepping, which is crucial for counteracting large lateral pushes.

Abstract

This paper introduces a new approach to enhance the robustness of humanoid walking under strong perturbations, such as substantial pushes. Effective recovery from external disturbances requires bipedal robots to dynamically adjust their stepping strategies, including footstep positions and timing. Most advanced walking controllers restrict footstep locations to a predefined convex region, which substantially limits the disturbances they can recover from. In contrast, our method leverages reinforcement learning to dynamically adjust the permissible footstep region, expanding it to a larger, effectively non-convex area and allowing cross-over stepping, which is crucial for counteracting large lateral pushes. Additionally, our method adapts footstep timing in real time to further extend the range of recoverable disturbances. Based on these adjustments, feasible footstep positions and the divergent component of motion (DCM) trajectory are planned by solving a quadratic program (QP). Finally, we employ a DCM controller and an inverse dynamics whole-body control framework to ensure the robot effectively follows the trajectory.
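
To make the role of the DCM in push recovery more concrete, the following is a minimal sketch based on the standard linear inverted pendulum definition of the DCM, which is assumed here and not taken from the paper's equations. The robot mass and CoM height are hypothetical placeholders, the push is treated as an instantaneous impulse, and only one lateral axis is modelled. The function names are illustrative.

```python
import math

GRAVITY = 9.81  # m/s^2


def natural_frequency(com_height):
    """Natural frequency of the linear inverted pendulum, omega = sqrt(g / z)."""
    return math.sqrt(GRAVITY / com_height)


def dcm(com_pos, com_vel, omega):
    """Divergent component of motion along one axis: xi = x + x_dot / omega."""
    return com_pos + com_vel / omega


def dcm_after(dcm0, pivot, omega, t):
    """DCM after time t when the pivot point (e.g. the stance-foot CoP) is held
    constant: xi(t) = p + (xi0 - p) * exp(omega * t)."""
    return pivot + (dcm0 - pivot) * math.exp(omega * t)


def dcm_shift_from_push(force, duration, mass, omega):
    """Approximate DCM offset caused by a short push, treated as an impulse:
    delta_v = F * dt / m, and the DCM shifts by delta_v / omega."""
    delta_v = force * duration / mass
    return delta_v / omega


# Hypothetical robot parameters (NOT from the paper).
COM_HEIGHT = 0.7   # m, assumed
ROBOT_MASS = 40.0  # kg, assumed

omega = natural_frequency(COM_HEIGHT)
# Push magnitude and duration as quoted in the caption of Figure 1.
shift = dcm_shift_from_push(800.0, 0.1, ROBOT_MASS, omega)
print(f"approximate lateral DCM shift: {shift:.3f} m")
```

Because the offset between the DCM and the pivot grows exponentially with time, a large shift has to be countered quickly. This is one way to see why both the step location and the step timing matter for recovery.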

Paper Structure

This paper contains 23 sections, 26 equations, 8 figures, 1 table.

Figures (8)

  • Figure 1: Simulation of the Kangaroo robot subjected to a lateral push of 800 N for 0.1 s. Our method allows the robot to quickly recover from such a large disturbance through a leg cross-over and a simultaneous adjustment in step timing.
  • Figure 2: Overview of our framework. The main contribution is extending the model-based control framework by an RL-based step timing and region adaptation. An inverse dynamics whole-body controller generates the desired joint torques. Model-based time adaptation is only used in the baseline method.
  • Figure 3: Graphical illustration of how the DCM start point, an input to the optimization problem, is computed. The figure compares model-based step timing adaptation with the improvement made possible when the RL agent adjusts the footstep region.
  • Figure 4: Step region parametrization. The extended footstep region is parametrized by a convex region (here $P_{\mathrm{step},2}$ for single support) and a rotation angle $\theta$ around the stance foot.
  • Figure 5: Push force evaluation. We conduct 1000 experiments in which the robot is pushed with random forces in random directions. An SVM with an RBF kernel is used to obtain the contour of maximal recoverable disturbance that separates successful from failed trials.
  • ...and 3 more figures
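
The step region parametrization described in the caption of Figure 4 (a convex region plus a rotation angle $\theta$ around the stance foot) can be sketched as follows. This is only an illustration under stated assumptions: the rectangle dimensions, the axis convention (x forward, y to the left), the sign of $\theta$, and all function names are hypothetical and not taken from the paper.

```python
import math


def rotate_about(point, pivot, theta):
    """Rotate a 2D point counter-clockwise by theta (radians) around pivot."""
    dx, dy = point[0] - pivot[0], point[1] - pivot[1]
    c, s = math.cos(theta), math.sin(theta)
    return (pivot[0] + c * dx - s * dy, pivot[1] + s * dx + c * dy)


def rotate_region(vertices, pivot, theta):
    """Rotate every vertex of a polygon around the pivot (the stance foot)."""
    return [rotate_about(v, pivot, theta) for v in vertices]


def contains(vertices, point, tol=1e-9):
    """Point-in-convex-polygon test; vertices must be in counter-clockwise order."""
    n = len(vertices)
    for i in range(n):
        ax, ay = vertices[i]
        bx, by = vertices[(i + 1) % n]
        # Cross product is negative when the point lies to the right of edge a->b.
        cross = (bx - ax) * (point[1] - ay) - (by - ay) * (point[0] - ax)
        if cross < -tol:
            return False
    return True


# Hypothetical setup: right foot in stance at the origin, nominal convex
# region for the left swing foot on the left side (all values assumed).
STANCE_FOOT = (0.0, 0.0)
NOMINAL_REGION = [(-0.2, 0.15), (0.3, 0.15), (0.3, 0.45), (-0.2, 0.45)]

# A cross-over target in front of, and slightly across, the stance foot.
target = (0.3, -0.1)

# Rotating the region clockwise by 90 degrees moves it in front of the stance foot.
crossover_region = rotate_region(NOMINAL_REGION, STANCE_FOOT, -math.pi / 2)

print("target in nominal region:", contains(NOMINAL_REGION, target))
print("target in rotated region:", contains(crossover_region, target))
```

Each rotated copy remains convex, so it can still be expressed as linear constraints, while the area swept over a range of angles is in general not convex. This is consistent with the "effectively non-convex" region described in the abstract, although how the paper encodes the constraint in its QP is not shown here.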