Phys4DGen: Physics-Compliant 4D Generation with Multi-Material Composition Perception
Jiajing Lin, Zhenzhong Wang, Dejun Xu, Shu Jiang, YunPeng Gong, Min Jiang
TL;DR
Phys4DGen addresses two gaps in 4D generation, limited physical realism and poor handling of multi-material objects, by coupling a multi-material perception pipeline with physics-based simulation. The pipeline comprises 3D Gaussian generation, 3D Material Grouping, Internal Physical Structure Discovery, and MLLM-guided material identification (fusing GPT-4o and CLIP), and it produces a material-continuum representation that drives dynamics based on the Material Point Method (MPM). From a single image or a 3D input, the approach yields physically plausible, high-fidelity 4D content, is reported to outperform state-of-the-art methods in spatiotemporal consistency and realism, and supports fine-grained material control. Potential applications include animation, gaming, and AR/VR, and the framework could be extended to multi-object scenes.
Abstract
4D content generation aims to create dynamically evolving 3D content conditioned on specific inputs such as images or 3D representations. Current approaches typically incorporate physical priors to animate 3D representations, but they have two significant limitations: they require users, who often lack physics expertise, to specify material properties manually, and they struggle to generate multi-material composite objects. To address these challenges, we propose Phys4DGen, a novel 4D generation framework that integrates multi-material composition perception with physical simulation. The framework achieves automated, physically plausible 4D generation through three modules: first, the 3D Material Grouping module partitions heterogeneous material regions on the surfaces of 3D representations via semantic segmentation; second, the Internal Physical Structure Discovery module constructs the mechanical structure of object interiors; third, a material identification module distills physical prior knowledge from multimodal large language models to identify material properties rapidly and automatically for both object surfaces and interiors. Experiments on both synthetic and real-world datasets demonstrate that Phys4DGen can generate high-fidelity 4D content with physical realism in open-world scenarios, significantly outperforming state-of-the-art methods.
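To make the link between material identification and simulation concrete, the following is a minimal sketch, not the authors' implementation, of how identified materials might be turned into per-particle simulation parameters. It assumes each particle already carries a material group ID and that each group has been labeled with a material name. The names `Material`, `MATERIALS`, and `assign_particle_parameters` are hypothetical, and the property values are illustrative placeholders rather than figures from the paper. The only standard ingredient is the conversion from Young's modulus E and Poisson's ratio ν to the Lamé parameters μ and λ that elastic constitutive models commonly use.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Material:
    """Hypothetical container for the properties assigned to one material group."""
    name: str
    youngs_modulus: float  # E, in Pa
    poisson_ratio: float   # nu, dimensionless, must be below 0.5
    density: float         # in kg/m^3


def lame_parameters(youngs_modulus: float, poisson_ratio: float) -> tuple[float, float]:
    """Convert (E, nu) to the Lame parameters (mu, lambda) of linear elasticity."""
    mu = youngs_modulus / (2.0 * (1.0 + poisson_ratio))
    lam = (youngs_modulus * poisson_ratio) / (
        (1.0 + poisson_ratio) * (1.0 - 2.0 * poisson_ratio)
    )
    return mu, lam


# Illustrative placeholder values only; in the paper's setting these would be
# identified automatically rather than hard-coded.
MATERIALS = {
    "rubber": Material("rubber", youngs_modulus=1e6, poisson_ratio=0.45, density=1100.0),
    "metal": Material("metal", youngs_modulus=2e11, poisson_ratio=0.30, density=7800.0),
}


def assign_particle_parameters(group_ids, group_to_material):
    """Return one (density, mu, lambda) tuple per particle.

    group_ids: material group ID of each particle (assumed given).
    group_to_material: maps a group ID to a key of MATERIALS (assumed given).
    """
    params = []
    for gid in group_ids:
        material = MATERIALS[group_to_material[gid]]
        mu, lam = lame_parameters(material.youngs_modulus, material.poisson_ratio)
        params.append((material.density, mu, lam))
    return params
```

For example, `assign_particle_parameters([0, 1, 0], {0: "rubber", 1: "metal"})` returns three tuples, with the first and third sharing the rubber parameters. Keeping properties per group rather than per object is what would let a single object mix soft and stiff regions in one simulation.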
