M-PhyGs: Multi-Material Object Dynamics from Video

Norika Wada, Kohei Yamashita, Ryo Kawahara, Ko Nishino

TL;DR

M-PhyGs tackles the challenging problem of estimating physically meaningful parameters for real, multi-material deformable objects from short videos. It introduces a hybrid representation with 3D Gaussian splatting and dense particles, enabling gravity-aware joint segmentation and parameter estimation of multiple materials. The method relies on cascaded 3D and 2D supervision and temporal mini-batching to stabilize optimization, and it validates its effectiveness on the Phlowers dataset, achieving state-of-the-art dynamics prediction and material estimation. This work advances embodied vision by enabling physics-informed understanding of natural objects like flowers, with practical potential for robotics and manipulation planning. Phlowers provides a valuable benchmark for future multi-material dynamics research.

Abstract

Knowledge of the physical material properties governing the dynamics of a real-world object is necessary to accurately anticipate its response to unseen interactions. Existing methods for estimating such physical material parameters from visual data assume homogeneous single-material objects, pre-learned dynamics, or simplistic topologies. Real-world objects, however, are often complex in material composition and geometry, and lie outside the realm of these assumptions. In this paper, we focus on flowers as a representative common object. We introduce Multi-material Physical Gaussians (M-PhyGs) to estimate the material composition and parameters of such complex, multi-material natural objects from video. From a short video captured in a natural setting, M-PhyGs jointly segments the object into similar materials and recovers their continuum mechanical parameters while accounting for gravity. M-PhyGs achieves this efficiently with newly introduced cascaded 3D and 2D losses, and by leveraging temporal mini-batching. We introduce a dataset, Phlowers, of people interacting with flowers as a novel platform to evaluate the accuracy of this challenging task of multi-material physical parameter estimation. Experimental results on the Phlowers dataset demonstrate the accuracy and effectiveness of M-PhyGs and its components.
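
To make "jointly segments the object into similar materials and recovers their continuum mechanical parameters" concrete, the following is a minimal sketch of a per-segment parameterization, not the authors' implementation. It assumes each particle carries an integer segment label and that each segment owns one Young's modulus and one density (the two parameters named in the overview figure). The function name, array names, and all numeric values are hypothetical and chosen only for illustration.

```python
import numpy as np

# Hypothetical toy data: 6 particles assigned to 3 material segments.
segment_of_particle = np.array([0, 0, 1, 1, 2, 2])

# One parameter set per segment (illustrative values, not from the paper).
youngs_modulus = np.array([1.0e6, 3.0e4, 3.0e4])  # Pa
density = np.array([900.0, 300.0, 300.0])         # kg/m^3


def per_particle_parameters(labels, modulus_per_segment, density_per_segment):
    """Broadcast per-segment parameters to particles by label lookup."""
    return modulus_per_segment[labels], density_per_segment[labels]


E, rho = per_particle_parameters(segment_of_particle, youngs_modulus, density)
```

With this layout the number of unknowns scales with the number of segments rather than the number of particles. Segments that end up with matching parameters, such as segments 1 and 2 here, are natural candidates for being merged into a single material.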

Paper Structure

This paper contains 30 sections, 19 equations, 13 figures, 3 tables.

Figures (13)

  • Figure 2: Overview of M-PhyGs. From dense multi-view images of a multi-material deformable object in a static state, we first recover a set of 3D Gaussians and uniformly distribute 3D particles inside the object. From a short video of physical interactions with the object, captured from a sparse set of views, M-PhyGs estimates the physical material parameters (Young's modulus and density) of these particles, which drive the 3D Gaussians. This estimation is achieved by minimizing discrepancies between the predicted and observed dynamics, first in 3D geometry by assuming local rigidity and then in the 2D image plane with full non-rigid dynamics.
  • Figure 3: M-PhyGs accounts for gravity in the forward dynamics computation by adding a rotated initial internal force to counter the gravitational force at rest shape.
  • Figure 4: M-PhyGs leverages DINO [simeoni2025dinov3] and GARField (affinity) [cmk2024garfield] features assigned to each 3D particle for initial fine-grained material segmentation. It estimates per-segment physical material parameters and encourages further merging of the segments with a material grouping loss.
  • Figure 5: Estimated physical material parameters of our method and existing methods [lin25omniphysgs, zhang24physdreamer, le2025pixie, zhong24springgaus, zhang2024dynamics]. For OmniPhysGS [lin25omniphysgs], the constitutive model (i.e., probability of whether the particle is elastic or not), and for gs-dynamics [zhang2024dynamics], the node features are shown, respectively. The estimated per-segment material parameters of M-PhyGs form clusters that roughly align with the different object parts.
  • Figure 6: Unseen (in-sequence) dynamics predicted with estimated physical material parameters. For each object, the first two show samples of training frames and the last three show samples of predicted frames. M-PhyGs predicts the complex motion of each flower, which aligns with the actual held-out observations. In contrast, existing methods fatally diverge from the true motion, often completely collapsing due to erroneous material estimates.
  • ...and 8 more figures
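
The gravity handling described in the Figure 3 caption can be illustrated with a small sketch. This is one plausible reading of the caption, not the paper's actual formulation: the captured rest shape is assumed to already be in equilibrium under gravity, so each particle stores an initial internal force that cancels gravity at rest, and that force is rotated by the particle's local rotation as the object deforms. The gravity direction, function names, and masses below are assumptions made for illustration.

```python
import numpy as np

# Assumed world frame: gravity points along -y (m/s^2).
GRAVITY = np.array([0.0, -9.81, 0.0])


def initial_internal_force(mass):
    """Per-particle force that exactly cancels gravity at the rest shape."""
    return -mass[:, None] * GRAVITY


def net_force(elastic_force, mass, f0, rotations):
    """Sum of elastic force, gravity, and the rotated initial internal force.

    elastic_force: (N, 3) force from the current deformation.
    rotations:     (N, 3, 3) local rotation of each particle relative to rest.
    """
    gravity_force = mass[:, None] * GRAVITY
    rotated_f0 = np.einsum("nij,nj->ni", rotations, f0)
    return elastic_force + gravity_force + rotated_f0


# Hypothetical two-particle example.
mass = np.array([0.01, 0.02])  # kg
f0 = initial_internal_force(mass)

# At the rest shape: no elastic deformation and identity rotations.
identity = np.tile(np.eye(3), (2, 1, 1))
rest_force = net_force(np.zeros((2, 3)), mass, f0, identity)

# After a 90-degree rotation about z, the stored force no longer cancels gravity.
rot_z90 = np.array([[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])
rotated = np.tile(rot_z90, (2, 1, 1))
moved_force = net_force(np.zeros((2, 3)), mass, f0, rotated)
```

At the rest shape the net force is zero, so the simulated object does not sag away from the observed static geometry before any interaction starts. Once a part rotates, the stored force and gravity stop cancelling, which produces a restoring load on that part.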