3D Human Interaction Generation: A Survey
Siyuan Fan, Wenke Huang, Xiantao Cai, Bo Du
TL;DR
3D Human Interaction Generation: A Survey organizes the rapidly evolving literature into three interaction paradigms: human-scene interaction (HSI), human-object interaction (HOI), and human-human interaction (HHI). It grounds the discussion in foundational technologies (3D representations, motion capture, generative models) and provides a comprehensive taxonomy of methods, datasets, and evaluation metrics, highlighting diffusion-based and text/scene-conditioned approaches as current frontrunners. The survey identifies critical challenges, including data scarcity, physical plausibility, and controllability, and outlines future directions such as large-scale real-world data collection, multi-person and non-rigid object interactions, and language-guided generation. Overall, the work serves as a central reference for researchers and practitioners aiming to advance realistic, context-aware 3D interaction synthesis across diverse application domains.
Abstract
3D human interaction generation has emerged as a key research area, focusing on producing dynamic and contextually relevant interactions between humans and various interactive entities. Recent advances in 3D model representation methods, motion capture technologies, and generative models have laid a solid foundation for the growing interest in this domain. Existing research can be broadly categorized into three areas: human-scene interaction, human-object interaction, and human-human interaction. Despite this progress, challenges remain: generated human motion must look natural, and the interactions between humans and interactive entities must be accurate. In this survey, we present a comprehensive literature review of human interaction generation, which, to the best of our knowledge, is the first of its kind. We begin by introducing the foundational technologies, including model representations, motion capture methods, and generative models. We then review the approaches proposed for the three sub-tasks, along with their corresponding datasets and evaluation metrics. Finally, we discuss potential future research directions and conclude the survey. We aim to offer a comprehensive overview of current advancements in the field, highlight key challenges, and inspire future research.
