Placing Human Animations into 3D Scenes by Learning Interaction- and Geometry-Driven Keyframes
James F. Mullen, Divya Kothandaraman, Aniket Bera, Dinesh Manocha
TL;DR
The paper addresses the problem of placing 3D human animations into static 3D scenes while preserving human-scene interactions, and introduces PAAK, a keyframe-driven framework. PAAK combines Geometric Keyframes and Active Keyframes and guides placement with an energy function $E(\tau, \theta)$ that balances scene affordances against penetration losses; per-frame interaction cues come from POSA, and a BADGE-based diversity mechanism selects informative frames. Perceptual user studies on the PROX dataset indicate that PAAK yields more realistic placements than both the PROX ground truth and competing baselines, which supports keyframe-driven optimization over end-to-end or purely geometric approaches. Limitations remain, including occasional unnatural placements; proposed future work covers multi-person scenarios, end-user quality ratings, and animation-level adjustments to further improve realism.
Abstract
We present a novel method for placing a 3D human animation into a 3D scene while maintaining any human-scene interactions in the animation. We identify the meshes in the animation that are most important for the interaction with the scene, which we call "keyframes." These keyframes allow us to better optimize the placement of the animation into the scene such that interactions in the animation (standing, lying, sitting, etc.) match the affordances of the scene (e.g., standing on the floor or lying in a bed). We compare our method, which we call PAAK, with prior approaches, including POSA, PROX ground truth, and a motion synthesis method, and highlight the benefits of our method with a perceptual study. Human raters preferred our PAAK method over the PROX ground truth data 64.6\% of the time. Additionally, in direct comparisons, raters preferred PAAK over competing methods, including 61.5\% of the time over POSA.
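The idea of scoring a candidate placement over a small set of keyframes can be illustrated with a toy sketch. The code below is not the PAAK implementation and does not use POSA: it assumes a hypothetical scene given as a signed distance function (`toy_scene_sdf`, a floor plus one spherical obstacle), hand-set per-vertex contact labels, a placement made of a translation `tau` and a rotation `theta` about the vertical axis, and an exhaustive search over a few candidates in place of a real optimizer. The weights `w_aff` and `w_pen` and the exact form of both terms are assumptions chosen only to show the trade-off between matching contacts and avoiding penetration.

```python
import numpy as np

def rotate_z(points, theta):
    """Rotate an (N, 3) array of points about the vertical (z) axis."""
    c, s = np.cos(theta), np.sin(theta)
    rot = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
    return points @ rot.T

def toy_scene_sdf(points):
    """Hypothetical scene: floor at z = 0 and a unit sphere centered at (0, 0, 1).
    Returns a signed distance per point; negative values lie inside solid geometry."""
    floor = points[:, 2]
    sphere = np.linalg.norm(points - np.array([0.0, 0.0, 1.0]), axis=1) - 1.0
    return np.minimum(floor, sphere)

def energy(tau, theta, keyframes, sdf, w_aff=1.0, w_pen=10.0):
    """Illustrative energy: contact mismatch plus penetration, summed over keyframes."""
    total = 0.0
    for verts, contact in keyframes:
        placed = rotate_z(verts, theta) + tau
        d = sdf(placed)
        # Affordance term: vertices labeled as contacts should sit on a surface (d == 0).
        affordance = np.mean(d[contact] ** 2) if contact.any() else 0.0
        # Penetration term: penalize any vertex inside the scene geometry (d < 0).
        penetration = np.mean(np.minimum(d, 0.0) ** 2)
        total += w_aff * affordance + w_pen * penetration
    return total

def place(keyframes, sdf, taus, thetas):
    """Exhaustively score candidate placements and return (energy, tau, theta) of the best."""
    candidates = [(energy(t, th, keyframes, sdf), t, th) for t in taus for th in thetas]
    return min(candidates, key=lambda c: c[0])

# One toy keyframe: a "standing" pose reduced to three vertices (foot, hip, head).
verts = np.array([[0.0, 0.0, 0.0], [0.0, 0.0, 1.0], [0.0, 0.0, 1.7]])
contact = np.array([True, False, False])  # only the foot should touch the scene
keyframes = [(verts, contact)]

taus = [np.array([0.0, 0.0, 0.0]), np.array([3.0, 0.0, 0.0])]
thetas = [0.0, np.pi / 2]
best_energy, best_tau, best_theta = place(keyframes, toy_scene_sdf, taus, thetas)
print(best_energy, best_tau, best_theta)
```

In this toy setup the placement at the origin puts the body inside the obstacle, so the search selects the translated candidate, where the foot rests on the floor and no vertex penetrates the scene. Evaluating the energy only on keyframes rather than on every frame is what keeps such a search cheap.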
