TeamHOI: Learning a Unified Policy for Cooperative Human-Object Interactions with Any Team Size

Stefan Lionar; Gim Hee Lee

TL;DR

This work presents TeamHOI, a framework that enables a single decentralized policy to handle cooperative HOIs across any number of cooperating agents, and introduces a masked Adversarial Motion Prior (AMP) strategy that uses single-human reference motions while masking object-interacting body parts during training.

Abstract

Physics-based humanoid control has achieved remarkable progress in enabling realistic and high-performing single-agent behaviors, yet extending these capabilities to cooperative human-object interaction (HOI) remains challenging. We present TeamHOI, a framework that enables a single decentralized policy to handle cooperative HOIs across any number of cooperating agents. Each agent operates on local observations while attending to its teammates through a Transformer-based policy network with teammate tokens, allowing scalable coordination across variable team sizes. To enforce motion realism while addressing the scarcity of cooperative HOI data, we further introduce a masked Adversarial Motion Prior (AMP) strategy that uses single-human reference motions while masking object-interacting body parts during training. The masked regions are then guided by task rewards to produce diverse and physically plausible cooperative behaviors. To promote stable carrying, we also design a formation reward that is agnostic to team size and object shape. We evaluate TeamHOI on a challenging cooperative carrying task involving two to eight humanoid agents and varied object geometries, where it achieves high success rates and demonstrates coherent cooperation across diverse configurations with a single policy.
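
To make the idea of attending to a variable number of teammates concrete, the following is a minimal NumPy sketch of single-head attention from one agent's token to a padded set of teammate tokens. It is an illustration only, not the paper's network: the function name teammate_attention, the padding scheme, and the use of a boolean presence mask are all assumptions. The point it shows is that padded (absent) teammate slots receive zero attention weight, so the same parameters can serve any team size.

```python
import numpy as np

def teammate_attention(ego_token, teammate_tokens, present):
    """Single-head attention from one ego token to padded teammate tokens.

    Hypothetical helper for illustration (not from the paper).
    ego_token:       (d,) query vector for the observing agent
    teammate_tokens: (max_teammates, d) padded teammate features (finite values)
    present:         (max_teammates,) bool, False for padded (absent) slots;
                     at least one teammate is assumed to be present
    """
    d = ego_token.shape[0]
    scores = teammate_tokens @ ego_token / np.sqrt(d)  # scaled dot-product scores
    scores = np.where(present, scores, -np.inf)        # hide padded slots
    scores = scores - scores[present].max()            # numerical stability
    weights = np.exp(scores)                           # exp(-inf) evaluates to 0
    weights = weights / weights.sum()                  # softmax over present teammates
    return weights @ teammate_tokens, weights
```

Because absent slots are masked before the softmax, changing the contents of the padding does not change the output, which is the property a team-size-agnostic policy needs from its teammate encoder.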

Paper Structure (31 sections, 30 equations, 10 figures, 4 tables)

Introduction
Related Work
Physics-based Human-Scene Interaction
Multi-Humanoid Interaction and Cooperation
Methodology
Preliminary
TeamHOI Framework
Cooperative Carrying Task
Formation Reward
Experiment
Implementation Details
Evaluation
Ablation Study
Conclusion
Training with Various Team Sizes
...and 16 more sections

Figures (10)

Figure 1: We present TeamHOI, a framework for learning a unified decentralized policy for cooperative human-object interactions (HOI) across varying team sizes and object configurations. Our framework enables effective cooperation where each humanoid acts independently from local observations while coordinating with others through a single shared policy. Video demonstrations are provided on our project page: https://splionar.github.io/TeamHOI.
Figure 2: Overview of the TeamHOI framework. A Transformer-based policy network enables coordination between the observing agent (green humanoid) and its teammates (grey humanoids) through alternating self- and cross-attention layers. By training across environments with diverse team sizes, the framework learns a unified policy that works across different team configurations. To maintain motion realism and enhance skill diversity, a masked AMP strategy blends full-body and masked discriminators based on object interaction.
Figure 3: Illustration of our principal-axes coverage reward.
Figure 4: Qualitative comparison across 4-agent (top) and 8-agent (bottom) configurations. Our method produces synchronized and stable teamwork in both cases, whereas the CooHOI* baselines exhibit limited or ineffective cooperation. The red line indicates the table's movement trajectory, and the black dot marks its final position at the end of each episode.
Figure 5: Ablation on the masked AMP strategy. Comparison between models trained with and without masked AMP, showing improved task rewards and successful hand-object interactions when masking is applied.
...and 5 more figures
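
The Figure 2 caption describes blending full-body and masked discriminators based on object interaction. A minimal sketch of one way such a blend could look is given below. Everything in it is an assumption made for illustration: the function names, zeroing masked features, the hard per-agent switch on an interaction flag, and the least-squares AMP reward form max(0, 1 - 0.25 (d - 1)^2) borrowed from the original AMP formulation. The paper's actual blending rule may differ.

```python
import numpy as np

def amp_style_reward(d_out):
    """Least-squares AMP-style reward: max(0, 1 - 0.25 * (d - 1)^2)."""
    return np.maximum(0.0, 1.0 - 0.25 * (d_out - 1.0) ** 2)

def mask_parts(amp_obs, part_mask):
    """Zero the features of object-interacting body parts (assumed masking scheme)."""
    return amp_obs * part_mask

def blended_style_reward(amp_obs, interacting, part_mask, disc_full, disc_masked):
    """Pick the masked or full-body style reward per agent.

    amp_obs:     (n_agents, feat_dim) motion features fed to the discriminators
    interacting: (n_agents,) bool, True while the agent is interacting with the object
    part_mask:   (feat_dim,) 1.0 for features to keep, 0.0 for masked parts
    disc_full, disc_masked: callables mapping (n_agents, feat_dim) -> (n_agents,)
                            hypothetical stand-ins for trained discriminators
    """
    r_full = amp_style_reward(disc_full(amp_obs))
    r_masked = amp_style_reward(disc_masked(mask_parts(amp_obs, part_mask)))
    # Assumed hard switch: masked reward during interaction, full-body otherwise.
    return np.where(interacting, r_masked, r_full)
```

Under this sketch, an interacting agent is not penalized for hand and arm poses that never appear in single-human reference motions, since those features are hidden from the discriminator that scores it; the task rewards are left to shape the masked parts.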
