NIFTY: Neural Object Interaction Fields for Guided Human Motion Synthesis
Nilesh Kulkarni, Davis Rempe, Kyle Genova, Abhijit Kundu, Justin Johnson, David Fouhey, Leonidas Guibas
TL;DR
NIFTY synthesizes realistic 3D human-object interactions by coupling a neural object interaction field with an object-conditioned diffusion model and a scalable synthetic data pipeline. A SMPL-based pose diffusion model is conditioned on object geometry, and its sampling is guided by a learned field that measures how far a pose is from the valid interaction manifold. Training uses large-scale synthetic data generated from a small set of anchor poses via reverse-time HuMoR rollouts. The key contributions are the object interaction field, the diffusion-guided sampling framework, and the automated data-generation pipeline. Together they yield more plausible interactions (e.g., sitting and lifting) across diverse objects than alternative approaches, as measured by quantitative metrics and a user study. The approach reduces the data needed to learn human-object interactions and enables flexible, object-aware motion synthesis in realistic scenes.
Abstract
We address the problem of generating realistic 3D motions of humans interacting with objects in a scene. Our key idea is to create a neural interaction field attached to a specific object, which outputs the distance to the valid interaction manifold given a human pose as input. This interaction field guides the sampling of an object-conditioned human motion diffusion model, so as to encourage plausible contacts and affordance semantics. To support interactions with scarcely available data, we propose an automated synthetic data pipeline. For this, we seed a pre-trained motion model, which has priors for the basics of human movement, with interaction-specific anchor poses extracted from limited motion capture data. Using our guided diffusion model trained on generated synthetic data, we synthesize realistic motions for sitting and lifting with several objects, outperforming alternative approaches in terms of motion quality and successful action completion. We call our framework NIFTY: Neural Interaction Fields for Trajectory sYnthesis.
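The guidance idea described above can be illustrated with a minimal toy sketch. It is not the authors' implementation: the "pose" is a 2D point, the interaction manifold is assumed to be the unit circle, and `interaction_field`, `field_guidance`, `toy_denoiser`, and `sample` are hypothetical names introduced only for this example. At each sampling step, the denoiser's estimate is nudged down the gradient of the squared field value, so samples drift toward poses the field considers valid.

```python
import numpy as np

def interaction_field(x):
    """Toy field (assumption): distance from each 2D point to the unit circle."""
    return np.abs(np.linalg.norm(x, axis=-1) - 1.0)

def field_guidance(x, eps=1e-8):
    """Analytic gradient of 0.5 * interaction_field(x)**2 with respect to x."""
    r = np.linalg.norm(x, axis=-1, keepdims=True)
    return (r - 1.0) * x / (r + eps)

def toy_denoiser(x):
    """Hypothetical placeholder for a trained denoiser: shrinks samples toward the origin."""
    return 0.9 * x

def sample(n_samples=64, n_steps=50, guidance_scale=0.5, seed=0):
    """Iterative sampling loop with optional field guidance on the denoised estimate."""
    rng = np.random.default_rng(seed)
    x = 2.0 * rng.normal(size=(n_samples, 2))  # start from noise
    for t in range(n_steps, 0, -1):
        x0_hat = toy_denoiser(x)
        # Guidance: move the estimate toward the manifold where the field is zero.
        x0_hat = x0_hat - guidance_scale * field_guidance(x0_hat)
        sigma = 0.3 * (t - 1) / n_steps  # noise level, zero at the final step
        x = x0_hat + sigma * rng.normal(size=x.shape)
    return x

guided = sample(guidance_scale=0.5)
unguided = sample(guidance_scale=0.0)
print("mean field value, guided:  ", interaction_field(guided).mean())
print("mean field value, unguided:", interaction_field(unguided).mean())
```

In this toy setting the guided samples end up much closer to the assumed manifold than the unguided ones. The guidance scale trades off how strongly the field overrides the denoiser's own prior.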
