Synthesizing Physically Plausible Human Motions in 3D Scenes

Liang Pan; Jingbo Wang; Buzhen Huang; Junyu Zhang; Haofan Wang; Xu Tang; Yangang Wang

TL;DR

The paper addresses the challenge of generating physically plausible, long-term human motions in cluttered 3D scenes. It introduces InterScene, a physics-based framework that decouples human-scene interaction into interacting and navigating, handled by two controllers, InterCon and NavCon. The policies are trained with Adversarial Motion Priors (AMP) and coordinated by a rule-based scheduler. Key contributions include the two reusable controllers, training strategies such as seated pose sampling and interaction early termination, and validation on both single-object tasks and long-term multi-object scenes, including an extensibility demonstration with a lie-down skill. The approach enables scalable, realistic human-scene interactions in complex environments and has practical potential for animation and simulation applications. Code and video are available at the authors’ repository.

Abstract

We present a physics-based character control framework for synthesizing human-scene interactions. Recent advances adopt physics simulation to mitigate artifacts produced by data-driven kinematic approaches. However, existing physics-based methods mainly focus on single-object environments, which limits their applicability in realistic 3D scenes with multiple objects. To address this challenge, we propose a framework that enables physically simulated characters to perform long-term interaction tasks in diverse, cluttered, and unseen 3D scenes. The key idea is to decouple human-scene interactions into two fundamental processes, Interacting and Navigating, which motivates us to construct two reusable Controllers, namely InterCon and NavCon. Specifically, InterCon uses two complementary policies to enable characters to enter or leave the interacting state with a particular object (e.g., sitting on a chair or getting up). To realize navigation in cluttered environments, we introduce NavCon, where a trajectory-following policy enables characters to track pre-planned collision-free paths. Benefiting from the divide-and-conquer strategy, we can train all policies in simple environments and directly apply them in complex multi-object scenes through coordination from a rule-based scheduler. Video and code are available at https://github.com/liangpan99/InterScene.
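
To make the coordination idea concrete, here is a minimal sketch of how a rule-based scheduler might choose among the three low-level actions named in the abstract (sit down, get up, follow a trajectory). It is an illustration under stated assumptions, not the authors' implementation: the names `Action`, `CharacterState`, `schedule`, and the `near_radius` threshold of 1.0 m are all hypothetical.

```python
from dataclasses import dataclass
from enum import Enum, auto


class Action(Enum):
    FOLLOW_TRAJECTORY = auto()  # would be served by NavCon
    SIT_DOWN = auto()           # would be served by InterCon's sit policy
    GET_UP = auto()             # would be served by InterCon's get-up policy


@dataclass
class CharacterState:
    seated: bool           # is the character currently in the seated state?
    dist_to_target: float  # distance to the target object (assumed metres)


def schedule(state: CharacterState, wants_to_sit: bool,
             near_radius: float = 1.0) -> Action:
    """Pick one low-level action for the current step.

    near_radius is an assumed hand-off distance, not a value from the paper.
    """
    if state.seated:
        # Remain seated while instructed to; otherwise leave the object.
        return Action.SIT_DOWN if wants_to_sit else Action.GET_UP
    if wants_to_sit and state.dist_to_target <= near_radius:
        # Close enough to the object: hand over to the interaction policy.
        return Action.SIT_DOWN
    # Otherwise keep tracking the pre-planned collision-free path.
    return Action.FOLLOW_TRAJECTORY
```

Because the rules only switch between already-trained policies, a scheme of this kind needs no retraining when the scene changes, which is the point of the divide-and-conquer strategy described above.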

Paper Structure (14 sections, 7 equations, 10 figures, 4 tables)

Introduction
Related Work
Method
Preliminaries
System Overview
Interaction Controller
Navigation Controller
Experiment
Individual Tasks
Long-term Motion Synthesis
Limitations and Future Work
Conclusion
Implementation Details
Learning to Lie Down

Figures (10)

Figure 1: We propose InterScene, a novel method that generates physically plausible long-term motion sequences in 3D indoor scenes. Our approach enables physics-based characters to exhibit natural interaction-involved behaviors, such as sitting down (gray), getting up (blue), and walking while avoiding obstacles (pink).
Figure 2: System overview. Given a multi-object 3D scene, our goal is to synthesize long-term motion sequences by controlling a physics-based character to perform a series of scene interaction tasks. First, our system employs an interaction controller to provide two primary actions, i.e., sitting down and getting up. Second, we introduce a navigation controller to acquire another action, i.e., collision-free trajectory following. Finally, a rule-based action scheduler is exploited to obtain outputs by organizing reusable low-level actions according to user-designed instructions.
Figure 3: Seated poses sampled from the reference dataset (upper row) and generated by pre-trained sit policy (lower row).
Figure 4: The process for training the sit and get-up policies consists of three steps. 1) Sit Policy Training: Inspired by Hassan et al. (2023), we first extend the standard AMP framework with several improvements to train a robust sit policy. 2) Seated Pose Sampling: We tackle the issue of lacking high-quality seated human poses adapted to various object shapes by using the pre-trained sit policy to generate numerous seated poses randomly. 3) Get-up Policy Training: We adopt a similar method to train the get-up policy. At the beginning of each training episode, the character will be initialized to a seated state sampled from the previously synthesized database.
Figure 5: Performance curves of sit policies trained with various early termination settings. Colored regions denote the fluctuation range over 3 models.
...and 5 more figures
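
The three-step procedure in the Figure 4 caption is essentially a data-flow dependency: the get-up policy can only be trained after the sit policy has produced a database of seated states. The sketch below shows that ordering only. Every function here is a hypothetical stand-in (no physics simulation or AMP training is performed), and the pose representation is a placeholder.

```python
import random


def train_policy(name, init_states=None):
    """Stand-in for AMP-based policy training; returns a dummy record."""
    n = 0 if init_states is None else len(init_states)
    return {"name": name, "num_init_states": n}


def rollout_seated_pose(sit_policy, rng):
    """Stand-in for running the sit policy and saving the final seated pose."""
    return {
        "object_id": rng.randrange(10),          # placeholder object index
        "pose": [rng.random() for _ in range(3)],  # placeholder pose vector
    }


def build_policies(num_poses=100, seed=0):
    rng = random.Random(seed)
    # Step 1: train the sit policy first.
    sit_policy = train_policy("sit")
    # Step 2: sample seated poses by rolling out the pre-trained sit policy.
    seated_db = [rollout_seated_pose(sit_policy, rng) for _ in range(num_poses)]
    # Step 3: train the get-up policy, initializing episodes from that database.
    get_up_policy = train_policy("get_up", init_states=seated_db)
    return sit_policy, seated_db, get_up_policy
```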
