ContPhy: Continuum Physical Concept Learning and Reasoning from Videos
Zhicheng Zheng, Xin Yan, Zhenfang Chen, Jingzhou Wang, Qin Zhi Eddie Lim, Joshua B. Tenenbaum, Chuang Gan
TL;DR
ContPhy introduces the Continuum Physical Dataset to probe machine physical commonsense across soft bodies, fluids, and their interactions with rigid and articulated objects. It pairs a Unity-based physics generation pipeline with a question engine to produce diverse property and dynamics questions, and evaluates a spectrum of baselines (blind, visual, physical, MLLMs) alongside ContPRO, a neural–symbolic oracle that uses LLMs as program parsers and particle-based dynamics as the simulator. Across scenarios, current models show substantial gaps versus humans, especially for soft materials and fluids, while ContPRO achieves the strongest performance and even surpasses human accuracy on some predictive tasks. The work underscores the value of neural–symbolic approaches for integrating perception, simulation, and language reasoning to advance physical commonsense in continuum settings, and outlines directions for expanding language diversity and scene complexity in future benchmarks.
Abstract
We introduce the Continuum Physical Dataset (ContPhy), a novel benchmark for assessing machine physical commonsense. ContPhy complements existing physical reasoning benchmarks by covering the inference of diverse physical properties, such as mass and density, across various scenarios and the prediction of the corresponding dynamics. We evaluated a range of AI models and found that they still struggle to achieve satisfactory performance on ContPhy, which shows that current AI models still lack physical commonsense for the continuum, especially soft bodies, and illustrates the value of the proposed dataset. We also introduce an oracle model (ContPRO) that marries particle-based physical dynamics models with recent large language models, combining the advantages of both: precise dynamic predictions and interpretable reasoning. ContPhy aims to spur progress in perception and reasoning within diverse physical settings, narrowing the divide between human and machine intelligence in understanding the physical world. Project page: https://physical-reasoning-project.github.io
