GaussianProperty: Integrating Physical Properties to 3D Gaussians with LMMs

Xinli Xu, Wenhang Ge, Dicong Qiu, ZhiFei Chen, Dongyu Yan, Zhuoyun Liu, Haoyu Zhao, Hanfeng Zhao, Shunsi Zhang, Junwei Liang, Ying-Cong Chen

TL;DR

GaussianProperty presents a training-free pipeline that attaches physical properties to 3D Gaussians by integrating SAM-based segmentation with GPT-4V-driven material reasoning. It uses a global-local 2D reasoning module and a multi-view voting scheme to lift 2D material estimates to 3D Gaussians, enabling physics-based dynamics via the Material Point Method and material-aware grasping with adaptive force bounds. The core contributions include a part-level segmentation strategy, a global-local reasoning framework with gradual prompting, a 2D-to-3D voting mechanism, and demonstrated improvements in material segmentation, dynamics, and grasping on ABO, MVImgNet, and real-world objects, with open-source resources available. The approach offers a practical pathway to inferring and exploiting physical properties from visual data for robotics and simulation tasks, reducing annotation overhead and enabling scalable dynamic rendering and manipulation.

Abstract

Estimating physical properties for visual data is a crucial task in computer vision, graphics, and robotics, underpinning applications such as augmented reality, physical simulation, and robotic grasping. However, this area remains under-explored due to the inherent ambiguities in physical property estimation. To address these challenges, we introduce GaussianProperty, a training-free framework that assigns physical properties of materials to 3D Gaussians. Specifically, we integrate the segmentation capability of SAM with the recognition capability of GPT-4V(ision) to formulate a global-local physical property reasoning module for 2D images. Then we project the physical properties from multi-view 2D images to 3D Gaussians using a voting strategy. We demonstrate that 3D Gaussians with physical property annotations enable applications in physics-based dynamic simulation and robotic grasping. For physics-based dynamic simulation, we leverage the Material Point Method (MPM) for realistic dynamic simulation. For robot grasping, we develop a grasping force prediction strategy that estimates a safe force range required for object grasping based on the estimated physical properties. Extensive experiments on material segmentation, physics-based dynamic simulation, and robotic grasping validate the effectiveness of our proposed method, highlighting its crucial role in understanding physical properties from visual data. An online demo, code, more cases, and annotated datasets are available at https://Gaussian-Property.github.io.
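
The multi-view voting step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that an upstream step has already projected every Gaussian into each view and read off a per-view material label, with -1 marking Gaussians not visible in that view. The function name `vote_materials`, the array layout, and the use of plain majority voting are all hypothetical choices made for illustration.

```python
import numpy as np


def vote_materials(view_labels: np.ndarray, num_materials: int) -> np.ndarray:
    """Pick one material per Gaussian by majority vote over views.

    view_labels: int array of shape (num_views, num_gaussians); entry [v, g]
        is the material index seen for Gaussian g in view v, or -1 if the
        Gaussian is not visible in that view (assumed convention).
    Returns an int array of shape (num_gaussians,), with -1 for Gaussians
    that received no votes.
    """
    num_views, num_gaussians = view_labels.shape
    counts = np.zeros((num_gaussians, num_materials), dtype=np.int64)

    for labels in view_labels:
        visible = labels >= 0
        gaussian_idx = np.nonzero(visible)[0]
        # Add one vote for the material each visible Gaussian shows in this view.
        np.add.at(counts, (gaussian_idx, labels[visible]), 1)

    winners = counts.argmax(axis=1)
    # Gaussians never seen in any view keep an "unknown" label.
    winners[counts.sum(axis=1) == 0] = -1
    return winners


# Three views, four Gaussians, three candidate materials (toy data).
view_labels = np.array([
    [0, 1, -1, 2],
    [0, 1, -1, 1],
    [1, 1, -1, 2],
])
print(vote_materials(view_labels, num_materials=3))
```

In this toy input the third Gaussian is never visible, so it stays unlabeled, while the others take the label that most views agree on. Ties are broken toward the lowest material index, which is an artifact of `argmax` rather than a design decision from the paper.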

Paper Structure

This paper contains 42 sections, 11 equations, 18 figures, 5 tables.

Figures (18)

  • Figure 1: GaussianProperty is a training-free framework that adds physical properties to 3D Gaussians with the assistance of LMMs. Assigning physical properties to 3D Gaussians supports downstream tasks; this work demonstrates physics-based generative dynamics and robot grasping.
  • Figure 2: Overall pipeline. GaussianProperty first uses SAM to obtain a segmentation map of the object. The original images and the masks are then sent to a foundation model such as GPT-4V(ision), which is queried with material candidates to obtain the corresponding physical properties. After acquiring physical properties from 2D images, we use a multi-view approach and a voting strategy to add physical properties to the reconstructed 3D Gaussians.
  • Figure 3: Left: GPT-4V(ision) struggles to recognize the material when directly provided with both global and partial image inputs. Right: Enhanced with combined global-local information and association, the agent accurately characterizes the component's properties.
  • Figure 4: Qualitative results of Material Segmentation. Our model makes boundary-accurate physical material predictions.
  • Figure 5: Generative Dynamics. We present a potential downstream task for 3D Gaussians with physical properties: generative dynamics. When a force is imposed, the 3D Gaussians generate the corresponding motion. For example, in the first row, a top-down force is applied and the chair moves in response to it.
  • ...and 13 more figures