
iWalker: Imperative Visual Planning for Walking Humanoid Robot

Xiao Lin, Yuhao Huang, Taimeng Fu, Xiaobin Xiong, Chen Wang

TL;DR

The paper addresses robust humanoid walking in human-centric environments, where modular sensing-planning-control pipelines accumulate errors across components. It introduces iWalker, an end-to-end vision-to-control system built on two imperative learning (IL)-based bilevel optimizations (BLOs): iPath for dynamics-aware path planning and iStepper for dynamics-aware stepping, both trained with model predictive control (MPC) losses and collision-aware ESDF maps. Formally, the upper-level losses are $U_{path}=\mathrm{MSE}(\hat{\phi},\phi^*)$ with $\phi^*=\arg\min_{\phi} L_{path}(\hat{\phi},\phi)$, and $U_{step}=\mathrm{MSE}(\hat{S},S^*)+U_w+U_l+U_{ESDF}$ with $S^*=\arg\min_{S} L_{step}(\hat{S},S)$. The lower-level models enforce the unicycle dynamics $x_{k+1}=x_k+v_k\cos{\theta_k}\,dt$, $y_{k+1}=y_k+v_k\sin{\theta_k}\,dt$, $v_{k+1}=v_k+a_k\,dt$, $\theta_{k+1}=\theta_k+\omega_k\,dt$. Experiments in simulation and on a real BRUCE robot show improved dynamical feasibility, obstacle avoidance, and robustness to unseen environments, with reduced reliance on large handcrafted maps. The framework couples vision, dynamics, and optimization through differentiable BLOs and MPCs, and generalizes across simulated and real settings.
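The unicycle update in the TL;DR can be made concrete with a short rollout. The following is a minimal sketch, not the authors' implementation: the function names `rollout_unicycle` and `mse`, the state ordering `(x, y, v, theta)`, and the control ordering `(a, omega)` are assumptions chosen for illustration. It applies the four discrete-time equations exactly as written above, and shows the form of an MSE term such as the one in $U_{path}$.

```python
import numpy as np


def rollout_unicycle(state0, controls, dt):
    """Roll out the discrete unicycle model for len(controls) steps.

    state0   : (x, y, v, theta) at step 0 (ordering is an assumption).
    controls : sequence of (a_k, omega_k) pairs, one per step.
    dt       : step duration.
    Returns an array of shape (N + 1, 4) holding every visited state.
    """
    states = np.empty((len(controls) + 1, 4))
    states[0] = state0
    for k, (a, omega) in enumerate(controls):
        x, y, v, theta = states[k]
        states[k + 1] = (
            x + v * np.cos(theta) * dt,  # x_{k+1} = x_k + v_k cos(theta_k) dt
            y + v * np.sin(theta) * dt,  # y_{k+1} = y_k + v_k sin(theta_k) dt
            v + a * dt,                  # v_{k+1} = v_k + a_k dt
            theta + omega * dt,          # theta_{k+1} = theta_k + omega_k dt
        )
    return states


def mse(pred, target):
    """Mean squared error between two equally shaped arrays."""
    diff = np.asarray(pred, dtype=float) - np.asarray(target, dtype=float)
    return float(np.mean(diff ** 2))


# Hypothetical usage: constant forward speed, no acceleration or turning.
straight = rollout_unicycle((0.0, 0.0, 1.0, 0.0), np.zeros((3, 2)), dt=0.1)
```

In the bilevel setting described above, a trajectory like `straight` would play the role of the lower-level optimum, and `mse` would compare the network's prediction against it; how the paper's solver actually produces that optimum is not specified in this summary.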

Abstract

Humanoid robots, designed to operate in human-centric environments, serve as a fundamental platform for a broad range of tasks. Although humanoid robots have been extensively studied for decades, a majority of existing humanoid robots still heavily rely on complex modular frameworks, leading to inflexibility and potential compounded errors from independent sensing, planning, and acting components. In response, we propose an end-to-end humanoid sense-plan-act walking system, enabling vision-based obstacle avoidance and footstep planning for whole body balancing simultaneously. We designed two imperative learning (IL)-based bilevel optimizations for model-predictive step planning and whole body balancing, respectively, to achieve self-supervised learning for humanoid robot walking. This enables the robot to learn from arbitrary unlabeled data, improving its adaptability and generalization capabilities. We refer to our method as iWalker and demonstrate its effectiveness in both simulated and real-world environments, representing a significant advancement toward autonomous humanoid robots.

Paper Structure (22 sections, 9 equations, 5 figures, 1 table, 1 algorithm)


Figures (5)

  • Figure 1: A humanoid robot navigates autonomously through a messy office using iWalker. (a) is the input depth image; (b) is a visualization of the projected collision map; and (c) is a visualization of the planned path and footsteps.
  • Figure 2: The system pipeline of our humanoid robot, where iWalker is a walking planner for mid-level footstep control.
  • Figure 3: iWalker includes two bi-level optimizations (BLO) based on imperative learning (IL). The first BLO optimizes iPath to predict a dynamically optimal and collision-safe path, while the second BLO optimizes iStepper to predict physically feasible footsteps for low-level controllers. iPath uses the same network structure as iPlanner, while iStepper is a simple three-layer MLP model.
  • Figure 4: A visualization of the collision map and its gradient. The 3D point cloud (left) is obtained from depth projection, while the approximated ESDF map (right) is obtained by blurring.
  • Figure 5: We present both simulation and real-world demonstrations of our robot's navigation capabilities across distinct environments. Goal points are explicitly labeled. In Scene 1, the robot with iWalker trained on real-world data navigates in a factory-like scene using only three goal points, showcasing efficient navigation in unseen environments. In Scene 2, the robot traverses a 10-meter corridor with many turns, showing its proficiency in handling long pathways. In Scene 3, iWalker navigates a short path through environments with irregular shapes, illustrating its adaptability to noisy depth.
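
The Figure 4 caption mentions an ESDF map approximated by blurring. A rough sketch of that idea follows, assuming a 2D binary occupancy grid and a separable Gaussian blur; the grid layout, the kernel parameters `sigma` and `radius`, and the function names are all illustrative assumptions, and the paper's actual procedure may differ. The blurred grid acts as a smooth collision cost that decays away from obstacles, so its spatial gradient is defined everywhere within the blur radius.

```python
import numpy as np


def gaussian_kernel_1d(sigma, radius):
    """Normalized 1D Gaussian kernel with 2 * radius + 1 taps."""
    offsets = np.arange(-radius, radius + 1)
    kernel = np.exp(-0.5 * (offsets / sigma) ** 2)
    return kernel / kernel.sum()


def blur_occupancy(occupancy, sigma=2.0, radius=6):
    """Blur a binary occupancy grid into a smooth collision cost.

    Zero padding keeps the output the same shape as the input.
    The blur is applied along rows first, then along columns.
    """
    kernel = gaussian_kernel_1d(sigma, radius)
    padded = np.pad(occupancy.astype(float), radius, mode="constant")
    along_rows = np.apply_along_axis(
        lambda r: np.convolve(r, kernel, mode="valid"), 1, padded
    )
    return np.apply_along_axis(
        lambda c: np.convolve(c, kernel, mode="valid"), 0, along_rows
    )


# Hypothetical example: a single occupied cell in the middle of the grid.
occupancy = np.zeros((21, 21))
occupancy[10, 10] = 1.0
cost = blur_occupancy(occupancy)

# Finite-difference gradient of the cost; axis 0 is rows, axis 1 is columns.
grad_rows, grad_cols = np.gradient(cost)
```

Here the cost peaks at the occupied cell and falls off smoothly, and the gradient at a nearby free cell points back toward the obstacle, which is the property a gradient-based planner needs in order to push a path away from collisions.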