It Takes Two: Learning Interactive Whole-Body Control Between Humanoid Robots

Zuhong Liu, Junhao Ge, Minhao Xiong, Jiahao Gu, Bowei Tang, Wei Jing, Siheng Chen

TL;DR

The paper tackles dual-humanoid motion imitation, addressing the isolation issue in single-humanoid approaches. It introduces Harmanoid, a two-stage framework combining contact-aware motion retargeting with an interaction-driven motion controller to preserve interaction geometry and enforce physically plausible, synchronized behaviors. Experimental results on the Inter-X dataset show improved success rates, reduced interpenetrations, and more coherent coordination than single-humanoid baselines; ablations demonstrate the contributions of contact and interaction rewards plus curriculum. The work advances practical dual-robot collaboration and provides a pathway toward real-world deployment with perception and communication integration.

Abstract

The true promise of humanoid robotics lies beyond single-agent autonomy: two or more humanoids must engage in physically grounded, socially meaningful whole-body interactions that echo the richness of human social interaction. However, single-humanoid methods suffer from the isolation issue, ignoring inter-agent dynamics and causing misaligned contacts, interpenetrations, and unrealistic motions. To address this, we present Harmanoid, a dual-humanoid motion imitation framework that transfers interacting human motions to two robots while preserving both kinematic fidelity and physical realism. Harmanoid comprises two key components: (i) contact-aware motion retargeting, which restores inter-body coordination by aligning SMPL contacts with robot vertices, and (ii) an interaction-driven motion controller, which leverages interaction-specific rewards to enforce coordinated keypoints and physically plausible contacts. By explicitly modeling inter-agent contacts and interaction-aware dynamics, Harmanoid captures the coupled behaviors between humanoids that single-humanoid frameworks inherently overlook. Experiments demonstrate that Harmanoid significantly improves interactive motion imitation, surpassing existing single-humanoid frameworks that largely fail in such scenarios.
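To make the idea of an interaction-specific reward concrete, the sketch below scores how well two simulated humanoids preserve the inter-agent keypoint distances of a reference motion. It is an illustrative assumption, not the reward defined in the paper: the function names `pairwise_distances` and `interaction_reward`, the exponential kernel, and the `sigma` parameter are all hypothetical choices made for this example.

```python
import math


def pairwise_distances(kp_a, kp_b):
    """Distance from every keypoint of agent A to every keypoint of agent B."""
    return [[math.dist(p, q) for q in kp_b] for p in kp_a]


def interaction_reward(sim_a, sim_b, ref_a, ref_b, sigma=0.25):
    """Hypothetical reward in (0, 1]: equals 1.0 when the simulated inter-agent
    distances match the reference exactly and decays as they drift apart.
    `sigma` (metres) is an assumed tolerance, not a value from the paper."""
    d_sim = pairwise_distances(sim_a, sim_b)
    d_ref = pairwise_distances(ref_a, ref_b)
    # Squared error between corresponding inter-agent distances.
    errors = [
        (s - r) ** 2
        for row_s, row_r in zip(d_sim, d_ref)
        for s, r in zip(row_s, row_r)
    ]
    mse = sum(errors) / len(errors)
    return math.exp(-mse / sigma ** 2)


# Two keypoints per agent (e.g. a hand and a shoulder), in metres.
ref_a = [(0.0, 0.0, 1.0), (0.3, 0.0, 1.0)]
ref_b = [(1.0, 0.0, 1.0), (0.7, 0.0, 1.0)]

# Perfect tracking: the simulated keypoints coincide with the reference.
perfect = interaction_reward(ref_a, ref_b, ref_a, ref_b)

# Agent B has drifted 0.5 m away from agent A along x.
sim_b = [(1.5, 0.0, 1.0), (1.2, 0.0, 1.0)]
drifted = interaction_reward(ref_a, sim_b, ref_a, ref_b)

print(perfect, drifted)
```

Because the score depends only on distances between the two agents, each humanoid could track its own reference perfectly in isolation and still be penalized if the pair drifts apart, which is the kind of coupling that single-humanoid objectives do not capture.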

Paper Structure

This paper contains 18 sections, 23 equations, 7 figures, 3 tables.

Figures (7)

  • Figure 1: In dual-humanoid scenarios, the isolation issue of single-humanoid motion imitation causes interpenetrations during motion retargeting, leading to instability and unrealistic behaviors in motion controller training.
  • Figure 2: Overview of the proposed Harmanoid framework. It contains two key components: (i) contact-aware motion retargeting, which optimizes inter-body distance by aligning SMPL contacts with robot vertices, and (ii) an interaction-driven motion controller, which leverages interaction-specific rewards to enforce coordinated keypoints and physically plausible contacts.
  • Figure 3: Overview of our motion retargeting pipeline that maps SMPL contacts to the robot mesh and optimizes root pose to ensure realistic, penetration-free motions.
  • Figure 4: Comparison showing the effect of $\beta$ regularization during shape optimization, which keeps the human shape close to the robot structure.
  • Figure 5: Qualitative comparison between Inter-X mocap data, single-PHC retargeting, and our contact-aware retargeting algorithm. The red and green circles highlight regions of penetration/collision and successful avoidance, respectively. Our method better preserves interaction characteristics of the reference motion while reducing collisions.
  • ...and 2 more figures