Visual Physics: Discovering Physical Laws from Videos
Pradyumna Chari, Chinmay Talegaonkar, Yunhao Ba, Achuta Kadambi
TL;DR
The paper proposes Visual Physics, a three-component pipeline for discovering physical laws from video that jointly learns governing equations and their parameters. It combines a Mask R-CNN-based position detector, a beta-VAE-driven latent physics module, and a Eureqa-style genetic programming module for equation discovery, which yields symbolic, interpretable formulas. Experiments on synthetic and real 2D motion tasks recover symbolically accurate expressions and show affine mappings between latent nodes and ground-truth parameters, with robustness to noise and to varying dataset sizes. The work advances unsupervised physics discovery from visual data and releases a public dataset to support future research.
Abstract
In this paper, we teach a machine to discover the laws of physics from video streams. We assume no prior knowledge of physics, beyond a temporal stream of bounding boxes. The problem is very difficult because a machine must learn not only a governing equation (e.g. projectile motion) but also the existence of governing parameters (e.g. velocities). We evaluate our ability to discover physical laws on videos of elementary physical phenomena, such as projectile motion or circular motion. These elementary tasks have textbook governing equations and enable ground truth verification of our approach.
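To make the distinction between a governing equation and its governing parameters concrete, the following minimal sketch uses projectile motion, one of the elementary tasks named above. It is not the paper's method: it assumes the textbook form x(t) = x0 + vx·t, y(t) = y0 + vy·t − ½·g·t² is already known and recovers only the parameters by least squares, whereas the paper aims to discover the form of the equation as well. The function names, the frame rate, and the synthetic trajectory standing in for bounding-box centers are all illustrative assumptions.

```python
import numpy as np


def simulate_projectile(v0x, v0y, g=9.81, n_frames=30, fps=30.0):
    """Synthetic stand-in for a stream of bounding-box centers (hypothetical helper)."""
    t = np.arange(n_frames) / fps          # timestamp of each frame, in seconds
    x = v0x * t                            # horizontal position: constant velocity
    y = v0y * t - 0.5 * g * t**2           # vertical position: constant acceleration
    return t, x, y


def fit_projectile(t, x, y):
    """Recover governing parameters, ASSUMING the projectile equations are known."""
    # np.polyfit returns coefficients from the highest degree down.
    vx, _x0 = np.polyfit(t, x, 1)          # x(t) = vx * t + x0
    a, vy, _y0 = np.polyfit(t, y, 2)       # y(t) = a * t^2 + vy * t + y0, with a = -g / 2
    return {"v0x": vx, "v0y": vy, "g": -2.0 * a}


# Noise-free trajectory with assumed ground-truth parameters.
t, x, y = simulate_projectile(v0x=3.0, v0y=4.0)
params = fit_projectile(t, x, y)
print(params)
```

Here the initial velocities and the gravitational acceleration play the role of governing parameters. In the setting described in the abstract, neither their existence nor the quadratic form of the equation is given in advance.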
