Video-Driven Graph Network-Based Simulators
Franciszek Szewczyk, Gilles Louppe, Matthia Sabatelli
TL;DR
The paper addresses the problem of inferring physical properties from short videos so that a Graph Network-based Simulator (GNS) can be driven without explicit parameter inputs. It introduces the Video-Driven Graph Network-based Simulator (VDGNS), which combines a Video Encoder that outputs a latent physical encoding $P$ with a GNS that predicts accelerations; a semi-implicit Euler integrator then converts these accelerations into updated velocities and positions. The model is trained end-to-end by regressing the final accelerations while decoupling motion cues from video content. Empirical results on four Taichi-MPM material classes (water, sand, snow, elastic) show that the video encodings separate material properties and exhibit a strong linear relationship with the predicted motion. Performance is close to that of a Baseline that uses explicit encodings, and the model is robust to noise. The findings suggest that video-driven encodings can enable realistic, adaptable physics simulations for design, animation, and gaming, and they point to future work on unsupervised encodings and real-world video data.
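To make the role of the semi-implicit Euler integrator concrete, here is a minimal sketch of one integration step in plain Python. It is an illustration under stated assumptions, not the authors' implementation: the function name `semi_implicit_euler_step`, the time step `dt`, and the constant-acceleration stand-in `toy_acceleration` (used here in place of the learned GNS) are all hypothetical.

```python
from typing import List, Sequence, Tuple

Vector = List[float]


def semi_implicit_euler_step(
    positions: Sequence[Vector],
    velocities: Sequence[Vector],
    accelerations: Sequence[Vector],
    dt: float = 1.0,
) -> Tuple[List[Vector], List[Vector]]:
    """Advance all particles by one step.

    Semi-implicit Euler updates the velocity first and then uses the
    NEW velocity to update the position.
    """
    new_velocities = [
        [v + a * dt for v, a in zip(vel, acc)]
        for vel, acc in zip(velocities, accelerations)
    ]
    new_positions = [
        [x + v * dt for x, v in zip(pos, vel)]
        for pos, vel in zip(positions, new_velocities)
    ]
    return new_positions, new_velocities


def toy_acceleration(positions: Sequence[Vector]) -> List[Vector]:
    """Hypothetical stand-in for the learned simulator: constant downward pull."""
    return [[0.0, -10.0] for _ in positions]


# One particle at (0, 1) moving right at unit speed.
pos = [[0.0, 1.0]]
vel = [[1.0, 0.0]]
pos, vel = semi_implicit_euler_step(pos, vel, toy_acceleration(pos), dt=0.5)
```

In a learned simulator, the accelerations passed to this step would come from the network's prediction rather than from a fixed function; the update rule itself stays the same.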
Abstract
Lifelike visualizations in design, cinematography, and gaming rely on precise physics simulations, which typically require extensive computational resources and detailed physical input. This paper presents a method that infers a system's physical properties from a short video, eliminating the need for explicit parameter input, provided the conditions are close to those seen during training. The learned representation is then used within a Graph Network-based Simulator to emulate the trajectories of physical systems. We demonstrate that the video-derived encodings effectively capture the physical properties of the system and show a linear dependence between some of the encodings and the system's motion.
