
MultiPhys: Multi-Person Physics-aware 3D Motion Estimation

Nicolas Ugrinovic, Boxiao Pan, Georgios Pavlakos, Despoina Paschalidou, Bokui Shen, Jordi Sanchez-Riera, Francesc Moreno-Noguer, Leonidas Guibas

TL;DR

MultiPhys addresses the challenge of recovering physically plausible multi-person motion from monocular RGB by integrating a physics simulator into the reconstruction loop. Starting from SLAHMR's kinematic estimates $\widetilde{\mathbf{q}}^{i}_{t}$, it uses a PPO-based imitation policy to drive $N$ humanoids in MuJoCo toward the reference poses while enforcing collision and ground-contact constraints. An iterative loop-$N$ refinement improves stability and fidelity, yielding outputs $\mathbf{q}^{i}_{t}$ that are both kinematically coherent and physically compliant. Experiments on CHI3D, Hi4D, and ExPI show large reductions in penetration and skating with competitive pose accuracy, demonstrating the value of physics-informed multi-person motion estimation; code is publicly available.

Abstract

We introduce MultiPhys, a method designed for recovering multi-person motion from monocular videos. Our focus lies in capturing coherent spatial placement between pairs of individuals across varying degrees of engagement. MultiPhys, being physically aware, exhibits robustness to jittering and occlusions, and effectively eliminates penetration issues between the two individuals. We devise a pipeline in which the motion estimated by a kinematic-based method is fed into a physics simulator in an autoregressive manner. We introduce distinct components that enable our model to harness the simulator's properties without compromising the accuracy of the kinematic estimates. This results in final motion estimates that are both kinematically coherent and physically compliant. Extensive evaluations on three challenging datasets characterized by substantial inter-person interaction show that our method significantly reduces errors associated with penetration and foot skating, while performing competitively with the state-of-the-art on motion accuracy and smoothness. Results and code can be found on our project page (http://www.iri.upc.edu/people/nugrinovic/multiphys/).
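The autoregressive pipeline described above can be illustrated with a deliberately tiny sketch. Everything in it is an assumption made for illustration: a single 1D coordinate stands in for a humanoid, a hand-written PD controller (`pd_policy`) stands in for the learned imitation policy, `sim_step` is a toy point-mass integrator rather than the actual physics simulator, and `n_loop` reflects one plausible reading of the loop-N component (several control steps toward the same per-frame target before advancing). It is not the authors' implementation.

```python
from typing import List, Tuple


def pd_policy(pos: float, vel: float, target: float,
              kp: float = 40.0, kd: float = 8.0) -> float:
    """Hypothetical stand-in for the learned policy: a PD controller
    that maps (simulation state, kinematic target) to a control force."""
    return kp * (target - pos) - kd * vel


def sim_step(pos: float, vel: float, force: float,
             dt: float = 1.0 / 30.0, floor: float = 0.0) -> Tuple[float, float]:
    """Toy unit-mass simulator (semi-implicit Euler) with a ground constraint."""
    vel = vel + force * dt
    pos = pos + vel * dt
    if pos < floor:  # the "physics" refuses to go below the ground
        pos, vel = floor, 0.0
    return pos, vel


def track(targets: List[float], n_loop: int = 2, floor: float = 0.0) -> List[float]:
    """Follow per-frame kinematic targets autoregressively.

    The simulator state is carried from one frame to the next, and each
    target is pursued for `n_loop` control steps (assumed loop-N reading).
    """
    pos, vel = max(targets[0], floor), 0.0
    corrected = []
    for target in targets:
        for _ in range(n_loop):
            force = pd_policy(pos, vel, target)
            pos, vel = sim_step(pos, vel, force, floor=floor)
        corrected.append(pos)
    return corrected


if __name__ == "__main__":
    # The third kinematic target lies below the ground plane; the
    # simulated output cannot follow it there.
    print(track([0.5, 0.4, -0.1, 0.3], n_loop=2))
```

The point of the sketch is only the control flow: the output for each frame comes from the simulator state rather than from the kinematic estimate itself, so physically impossible targets (here, one below the floor) are never reproduced verbatim.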

Paper Structure

This paper contains 12 sections, 4 equations, 7 figures, 1 table.

Figures (7)

  • Figure 1: MultiPhys enables recovering multi-person 3D motion in a physically-aware manner. State-of-the-art methods (SLAHMR ye2023slahmr, top row) for multi-person motion recovery mostly rely on kinematic approaches, which typically ignore physical constraints, such as body penetration. Note that while individual poses are kinematically coherent, their spatial placement is suboptimal, resulting in significant penetration errors. MultiPhys (bottom row) incorporates physics constraints into the reconstruction process, yielding more physically plausible results.
  • Figure 2: MultiPhys Pipeline. Given an input video with multiple people (left), we first obtain initial kinematic estimates of the camera poses and 3D human motion using SLAHMR ye2023slahmr. Using these initial motion estimates, our proposed framework corrects them and makes them physically plausible (right).
  • Figure 3: Physics-aware Correction Module. We use the policy $\pi$ to control the humanoid agents with the initial kinematic poses. We simulate all agents simultaneously in order to apply physics-based constraints to the reconstructed motion. The policy computes features from both the current state of the simulation and the target pose to later generate the action signal $a$ that controls the agents. We place our loop-N component between target poses ${\widetilde{\mathbf{q}}}^{i}_{t+1}$ that correspond to each video frame.
  • Figure 4: Effect of the loop-N component for different values of $N_{l}$. We study the effect of $N_{l} \in \{1, \dots, 5\}$ on both (a) physics and (b) pose metrics. We report the Inter-Person Penetration (measured in m), the Ground Penetration (measured in mm), the Floor Skating (measured in mm), the W-MPJPE and PA-MPJPE (measured in mm), and the Acceleration Error (measured in mm/s$^2$). We choose $N_{l} = 2$ for the rest of the experiments, as it provides a good balance between physics and pose metrics (see the ablation subsection of the experiments). Note that we scale the Pen. metric by a factor of 1/10 to fit the graph. The table with these numbers is in the Supp. Mat.
  • Figure 5: Qualitative results of the proposed approach. The first three columns (from left to right) are from Hi4D yin2023hi4d and the other three are from CHI3D Fieraru_2020_CVPR. Each row corresponds to one frame of the same sequence. The columns compare the resulting poses at each frame using SLAHMR ye2023slahmr and our method. In these cases of close inter-person interaction, the estimated motion from SLAHMR often has severe inter-person penetrations, while our method is able to eliminate these penetrations through physics-aware correction.
  • ...and 2 more figures
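
Two of the pose metrics named in the Figure 4 caption have widely used textbook forms, sketched below in plain Python. This follows the common definitions (mean per-joint Euclidean distance, and mean error of second finite differences); the exact conventions used in the paper, such as alignment, units, and frame rate scaling, are not stated in this summary, so the function names and details here are assumptions.

```python
import math
from typing import List, Sequence

Joint = Sequence[float]   # one 3D joint position (x, y, z)
Pose = List[Joint]        # all joints in one frame
Motion = List[Pose]       # all frames of one sequence


def mpjpe(pred: Motion, gt: Motion) -> float:
    """Mean per-joint position error: average Euclidean distance
    between predicted and ground-truth joints over all frames."""
    dists = [math.dist(p, g)
             for pred_pose, gt_pose in zip(pred, gt)
             for p, g in zip(pred_pose, gt_pose)]
    return sum(dists) / len(dists)


def accelerations(motion: Motion) -> Motion:
    """Per-joint second finite difference x[t-1] - 2*x[t] + x[t+1],
    in units of position per frame squared (no frame-rate scaling)."""
    acc = []
    for t in range(1, len(motion) - 1):
        frame = []
        for j in range(len(motion[t])):
            prev_j, cur_j, next_j = motion[t - 1][j], motion[t][j], motion[t + 1][j]
            frame.append([a - 2.0 * b + c for a, b, c in zip(prev_j, cur_j, next_j)])
        acc.append(frame)
    return acc


def accel_error(pred: Motion, gt: Motion) -> float:
    """Mean distance between predicted and ground-truth accelerations."""
    return mpjpe(accelerations(pred), accelerations(gt))
```

Note that a constant offset between prediction and ground truth raises the position error but leaves the acceleration error at zero, which is why the two are reported separately: one measures placement, the other smoothness.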