Physical Property Understanding from Language-Embedded Feature Fields
Albert J. Zhai, Yuan Shen, Emily Y. Chen, Gloria X. Wang, Xinlei Wang, Sheng Wang, Kaiyu Guan, Shenlong Wang
TL;DR
The paper introduces NeRF2Physics, a training-free framework that predicts dense physical properties from image collections by constructing a language-embedded 3D feature field. It combines NeRF-derived geometry with CLIP-based per-point features and uses LLMs to generate a material dictionary, enabling zero-shot regression of properties such as mass, friction, and hardness. For volumetric properties, an object-level aggregation step uses LLMs to estimate surface thickness. The approach is validated on ABO-500 for mass and on real-world datasets for friction and hardness, where it surpasses several baselines and demonstrates robust, annotation-free reasoning about open-world objects. The work advances open-world physical-property understanding, with practical implications for digital twins, robotics, and agriculture, and highlights the potential of integrating vision-language models with geometric representations for physics-aware perception.
Abstract
Can computers perceive the physical properties of objects solely through vision? Research in cognitive science and vision science has shown that humans excel at identifying materials and estimating their physical properties based purely on visual appearance. In this paper, we present a novel approach for dense prediction of the physical properties of objects using a collection of images. Inspired by how humans reason about physics through vision, we leverage large language models to propose candidate materials for each object. We then construct a language-embedded point cloud and estimate the physical properties of each 3D point using a zero-shot kernel regression approach. Our method is accurate, annotation-free, and applicable to any object in the open world. Experiments demonstrate the effectiveness of the proposed approach in various physical property reasoning tasks, such as estimating the mass of common objects, as well as other properties like friction and hardness.
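The zero-shot kernel regression step mentioned above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that each 3D point carries a feature vector in the same embedding space as the text embeddings of the candidate materials, that the kernel is a softmax over cosine similarities, and that each material comes with a single scalar property value. The function names, the temperature value, the toy 3-dimensional vectors, and the density figures are all hypothetical and chosen only for illustration.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def softmax(scores, temperature):
    """Numerically stable softmax of scores divided by a temperature."""
    scaled = [s / temperature for s in scores]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def kernel_regress(point_feature, material_embeddings, material_values,
                   temperature=0.1):
    """Predict a property as a similarity-weighted average over materials.

    Assumption: the kernel is a softmax over cosine similarities; the
    temperature of 0.1 is an illustrative choice, not a value from the paper.
    """
    sims = [cosine(point_feature, e) for e in material_embeddings]
    weights = softmax(sims, temperature)
    return sum(w * v for w, v in zip(weights, material_values))

# Toy material dictionary (hypothetical embeddings and approximate densities).
material_embeddings = [
    [1.0, 0.0, 0.0],  # stands in for the text embedding of "wood"
    [0.0, 1.0, 0.0],  # stands in for the text embedding of "steel"
]
material_densities = [700.0, 7850.0]  # kg/m^3, illustrative values

# A point whose feature matches "steel" gets a density near the steel value.
steel_like = kernel_regress([0.0, 1.0, 0.0],
                            material_embeddings, material_densities)

# A point equally similar to both materials gets the average of the two.
ambiguous = kernel_regress([1.0, 1.0, 0.0],
                           material_embeddings, material_densities)
```

Because the prediction is a convex combination of the dictionary values, it always lies between the smallest and largest candidate value. This is why the quality of the proposed material dictionary matters in a scheme of this kind.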
