HAFO: A Force-Adaptive Control Framework for Humanoid Robots in Intense Interaction Environments

Chenhui Dong, Haozhe Xu, Wenhao Feng, Zhipeng Wang, Yanmin Zhou, Yifei Zhao, Bin He

TL;DR

The paper tackles the challenge of robust, forceful interaction in humanoid control by introducing HAFO, a dual-agent reinforcement learning framework that decouples lower-body locomotion from upper-body manipulation and trains under explicit disturbances modeled with a virtual spring-damper. It employs an asymmetric actor-critic where the critic accesses privileged force information to guide the learner toward generalizable force adaptation, and uses curriculum learning to progressively expose the policy to stronger disturbances, including rope-suspension scenarios. Key contributions include the dual-agent architecture, explicit disturbance modeling with the spring-damper system, and extensive simulation and real-world validation demonstrating improved stability and precision under heavy loads, thrust disturbances, and suspension tasks. The approach shows strong potential for practical loco-manipulation tasks in challenging environments and scales to larger humanoid platforms with robust force-resilience traits.
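The curriculum idea mentioned above (exposing the policy to progressively stronger disturbances) can be illustrated with a minimal sketch. The summary does not give the actual schedule, so everything here is an assumption for illustration: the function name `next_force_limit`, the success-rate thresholds, the step size, and the cap are hypothetical placeholders, not values from the paper.

```python
def next_force_limit(
    current: float,
    success_rate: float,
    *,
    step: float = 10.0,       # hypothetical increment in newtons
    cap: float = 100.0,       # hypothetical upper bound in newtons
    promote_at: float = 0.8,  # hypothetical threshold for raising difficulty
    demote_at: float = 0.4,   # hypothetical threshold for lowering difficulty
) -> float:
    """Return the disturbance-force limit to use in the next training stage.

    The limit grows when the policy succeeds often, shrinks when it fails
    often, and otherwise stays where it is.
    """
    if success_rate >= promote_at:
        return min(current + step, cap)
    if success_rate < demote_at:
        return max(current - step, 0.0)
    return current


if __name__ == "__main__":
    limit = 0.0
    # Made-up success rates, one per evaluation round.
    for rate in (0.9, 0.85, 0.5, 0.3, 0.95):
        limit = next_force_limit(limit, rate)
        print(f"success={rate:.2f} -> force limit {limit:.1f} N")
```

A threshold-based schedule like this is only one common way to build a curriculum; a fixed ramp over training iterations would be an equally plausible reading of the text.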

Abstract

Reinforcement learning (RL) controllers have made impressive progress in humanoid locomotion and lightweight object manipulation. However, achieving robust and precise motion control under intense force interaction remains a significant challenge. To address this challenge, this paper proposes HAFO, a dual-agent reinforcement learning framework that concurrently optimizes a robust locomotion strategy and a precise upper-body manipulation strategy via coupled training in environments with external disturbances. The external pulling disturbances are explicitly modeled using a spring-damper system, allowing for fine-grained force control through manipulation of the virtual spring. In this process, the reinforcement learning policy autonomously generates a disturbance-rejection response by utilizing environmental feedback. Furthermore, HAFO employs an asymmetric actor-critic framework in which the critic network's access to privileged external forces guides the actor network to acquire generalizable force adaptation for resisting external disturbances. The experimental results demonstrate that HAFO achieves whole-body control for humanoid robots across diverse force-interaction environments, delivering outstanding performance in load-bearing tasks and maintaining stable operation even in a rope-suspended state.
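To make the spring-damper disturbance concrete, the sketch below computes a pulling force from the generic textbook law F = k (x_des - x) - d v, where x_des is the virtual anchor position, x and v are the position and velocity of the attachment point, k is the stiffness, and d is the damping. This is an assumption for illustration: the abstract does not state the exact formulation used in HAFO, and the class name `VirtualSpringDamper`, the optional force limit, and all numeric values are hypothetical.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple

Vec3 = Tuple[float, ...]


@dataclass
class VirtualSpringDamper:
    """Generic linear spring-damper (not necessarily the paper's exact model)."""

    stiffness: float                 # k in N/m (hypothetical value)
    damping: float                   # d in N*s/m (hypothetical value)
    max_force: float = float("inf")  # optional cap on force magnitude in N

    def force(self, x_des: Sequence[float], x: Sequence[float],
              v: Sequence[float]) -> Vec3:
        """Force pulling the attachment point toward the anchor x_des."""
        raw = tuple(
            self.stiffness * (xd - xi) - self.damping * vi
            for xd, xi, vi in zip(x_des, x, v)
        )
        magnitude = sum(f * f for f in raw) ** 0.5
        # Rescale the vector if it exceeds the cap, preserving its direction.
        if magnitude > self.max_force and magnitude > 0.0:
            scale = self.max_force / magnitude
            raw = tuple(f * scale for f in raw)
        return raw


if __name__ == "__main__":
    spring = VirtualSpringDamper(stiffness=1000.0, damping=50.0)
    # Attachment point 0.1 m below the anchor, moving upward at 0.2 m/s.
    print(spring.force(x_des=(0.0, 0.0, 1.0), x=(0.0, 0.0, 0.9), v=(0.0, 0.0, 0.2)))
```

Moving the anchor `x_des` changes the commanded force smoothly, which is one way to read the abstract's remark about fine-grained force control through manipulation of the virtual spring.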

Paper Structure

This paper contains 22 sections, 5 equations, 5 figures, 8 tables.

Figures (5)

  • Figure 1: Overview of the HAFO model. (a) Policy Training. A dual-agent strategy with decoupled upper and lower bodies is adopted, where the lower-body policy takes root linear and angular velocities as command inputs, and the upper-body policy uses reference joint trajectories as command inputs. Additionally, various explicit dynamic perturbations are introduced at key locations to enhance the system's robustness and adaptability. (b) Strategy Deployment. A humanoid robot control system based on teleoperation is developed, employing an efficient inverse kinematics algorithm to compute the robot's joint angles in real time with high precision, enabling efficient loco-manipulation tasks.
  • Figure 1: Multi-scenario validation of the HAFO policy on the Unitree H1-2 humanoid robot. (a) Robot upper-body swing. (b) Punching and maintaining stable standing. (c) Stable locomotion with a 100 N external force applied to each hand. (d) Stable control in the suspended state. (e) Rapid robot adjustment after external impact.
  • Figure 2: Spring-damper model and performance analysis. (a) Spring-damper model schematic on the humanoid robot. (b) Pelvis position, spring force, and ground-reaction force versus a specified $\mathbf{X}_{\mathrm{des}}^{\mathrm{pelvis}}$ displacement for the spring-damper and stiffness models.
  • Figure 3: Policy performance under various external thrust disturbances. (a) Performance under continuous constant-force disturbances applied from multiple directions. (b) Performance under one-second transient force disturbances from multiple directions.
  • Figure 4: HAFO enables force-adaptive control for humanoid robots across multiple scenarios. (a) Whole-body control with sandbag loads. (b) Locomotion while carrying a box with one hand. (c) Being lifted onto an elevated platform and resuming locomotion. (d) Locomotion while carrying flower pots with both hands. (e) Policy deployment in a suspended state. (f) Performing dance movements. (g) Window cleaning while suspended by ropes.