LLM-Guided Material Inference for 3D Point Clouds
Nafiseh Izadyar, Teseo Schneider
TL;DR
This work tackles the lack of material and appearance annotations for 3D shapes by introducing a two-stage, zero-shot LLM-driven framework that first extracts object semantics from coarse geometry and then assigns materials to each geometric segment conditioned on those semantics. The approach leverages multi-view renderings and LLM prompts, and evaluates material plausibility using an LLM-as-a-judge framework (DeepEval) across 1,000 shapes from Fusion/ABS and ShapeNet. Results show high semantic accuracy and strong per-segment material plausibility, highlighting the potential of LLM priors to bridge geometric reasoning and appearance understanding in 3D data without labeled materials. This work opens avenues for material-aware 3D perception that can benefit photorealistic rendering and robotics.
Abstract
Most existing 3D shape datasets and models focus solely on geometry, overlooking the material properties that determine how objects appear. We introduce a two-stage large language model (LLM) based method for inferring material composition directly from 3D point clouds with coarse segmentations. Our key insight is to decouple reasoning about what an object is from what it is made of. In the first stage, an LLM predicts the object's semantics; in the second stage, it assigns plausible materials to each geometric segment, conditioned on the inferred semantics. Both stages operate in a zero-shot manner, without task-specific training. Because existing datasets lack reliable material annotations, we evaluate our method using an LLM-as-a-Judge implemented in DeepEval. Across 1,000 shapes from Fusion/ABS and ShapeNet, our method achieves high semantic and material plausibility. These results demonstrate that language models can serve as general-purpose priors for bridging geometric reasoning and material understanding in 3D data.
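The two-stage decoupling described above can be illustrated with a minimal sketch. This is not the authors' implementation: the `LLM` callable, the `Segment` type, the prompt wording, and the use of short text descriptions in place of multi-view renderings are all assumptions made for illustration, and `stub_llm` is a deterministic stand-in for a real model.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# Hypothetical LLM interface (assumption): prompt text in, reply text out.
LLM = Callable[[str], str]


@dataclass
class Segment:
    """One coarse geometric segment of a shape (hypothetical representation)."""
    segment_id: int
    description: str  # stand-in for the rendered views a real system would pass


def infer_semantics(llm: LLM, shape_description: str) -> str:
    """Stage 1: ask what the object is, ignoring materials entirely."""
    prompt = f"Stage 1: Name the object category. Shape: {shape_description}."
    return llm(prompt).strip().lower()


def infer_materials(llm: LLM, semantics: str, segments: List[Segment]) -> Dict[int, str]:
    """Stage 2: ask for one material per segment, conditioned on the Stage 1 label."""
    materials: Dict[int, str] = {}
    for seg in segments:
        prompt = (
            f"Stage 2: The object is a {semantics}. "
            f"Segment: {seg.description}. Name one plausible material."
        )
        materials[seg.segment_id] = llm(prompt).strip().lower()
    return materials


def two_stage_inference(
    llm: LLM, shape_description: str, segments: List[Segment]
) -> Tuple[str, Dict[int, str]]:
    """Run both stages zero-shot: no training, only prompting."""
    semantics = infer_semantics(llm, shape_description)
    return semantics, infer_materials(llm, semantics, segments)


def stub_llm(prompt: str) -> str:
    """Deterministic placeholder for a real LLM call (assumption, for testing)."""
    if prompt.startswith("Stage 1"):
        return "Chair"
    return "Fabric" if "seat" in prompt else "Wood"


if __name__ == "__main__":
    segs = [Segment(0, "flat seat"), Segment(1, "four thin legs")]
    print(two_stage_inference(stub_llm, "four legs, a seat, and a backrest", segs))
```

Keeping the stages as separate functions mirrors the stated insight: the material prompt never has to guess the object category, because it receives the Stage 1 label as context.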
