CORE4D: A 4D Human-Object-Human Interaction Dataset for Collaborative Object REarrangement
Yun Liu, Chengwen Zhang, Ruofan Xing, Bingda Tang, Bowen Yang, Li Yi
TL;DR
CORE4D tackles the scarcity of large-scale, high-fidelity 4D human-object-human (HOH) interaction data for collaborative object rearrangement by combining 1K real HOH sequences with 10K synthetic retargeted sequences across 3K object shapes. It introduces a hybrid data-collection pipeline that preserves temporal collaboration patterns while expanding spatial relations through collaboration retargeting, which includes object-centric DeepSDF-based retargeting and a human-centric pose discriminator for selection. The work benchmarks two tasks, motion forecasting and interaction synthesis, demonstrating the challenges of modeling multi-person collaboration and showing that synthetic data can enhance forecasting performance and enable humanoid skill learning. By providing diverse object geometries, collaboration modes, and 3D scenes, CORE4D offers a practical resource for VR/AR, human-robot interaction, and humanoid manipulation research, while acknowledging limitations such as the absence of outdoor scenes and of visual data in the synthetic branch.
Abstract
Understanding how humans cooperatively rearrange household objects is critical for VR/AR and human-robot interaction. However, the modeling of these behaviors remains under-researched due to the lack of relevant datasets. We fill this gap by presenting CORE4D, a novel large-scale 4D human-object-human interaction dataset focusing on collaborative object rearrangement, which encompasses diverse compositions of various object geometries, collaboration modes, and 3D scenes. Starting from 1K human-object-human motion sequences captured in the real world, we enrich CORE4D with an iterative collaboration retargeting strategy that augments these motions to a variety of novel objects. Leveraging this approach, CORE4D comprises a total of 11K collaboration sequences spanning 3K real and virtual object shapes. Benefiting from the extensive motion patterns provided by CORE4D, we benchmark two tasks aimed at generating human-object interaction: human-object motion forecasting and interaction synthesis. Extensive experiments demonstrate the effectiveness of our collaboration retargeting strategy and indicate that CORE4D poses new challenges to existing human-object interaction generation methodologies.
