MatSpray: Fusing 2D Material World Knowledge on 3D Geometry

Philipp Langsteiner, Jan-Niklas Dihlmann, Hendrik P. A. Lensch

TL;DR

MatSpray presents a unified framework that fuses 2D diffusion-based material priors with 3D Gaussian Splatting to produce relightable, spatially varying PBR materials for reconstructed scenes. It transfers per-view 2D material predictions into a 3D Gaussian representation via Gaussian ray tracing, then refines them with a Neural Merger that enforces cross-view consistency under deferred shading losses. The combination of diffusion priors, 3D ray-based lifting, and softmax-based material fusion yields higher-quality relighting and more accurate metallic/roughness estimates than prior 3D-material methods, while significantly reducing per-scene optimization time. This approach enables faster creation of photorealistic, relightable 3D assets suitable for production pipelines and real-world capture scenarios.

Abstract

Manual modeling of material parameters and 3D geometry is a time-consuming yet essential task in the gaming and film industries. While recent advances in 3D reconstruction have enabled accurate approximations of scene geometry and appearance, these methods often fall short in relighting scenarios due to the lack of precise, spatially varying material parameters. At the same time, diffusion models operating on 2D images have shown strong performance in predicting physically based rendering (PBR) properties such as albedo, roughness, and metallicity. However, transferring these 2D material maps onto reconstructed 3D geometry remains a significant challenge. We propose a framework for fusing 2D material data into 3D geometry using a combination of novel learning-based and projection-based approaches. We begin by reconstructing scene geometry via Gaussian Splatting. From the input images, a diffusion model generates 2D maps for albedo, roughness, and metallic parameters. Any existing diffusion model that can convert images or videos to PBR materials can be applied. The predictions are then integrated into the 3D representation either by optimizing an image-based loss or by directly projecting the material parameters onto the Gaussians using Gaussian ray tracing. To enhance fine-scale accuracy and multi-view consistency, we further introduce a lightweight neural refinement step (Neural Merger), which takes ray-traced material features as input and produces detailed adjustments. Our results demonstrate that the proposed methods outperform existing techniques in both quantitative metrics and perceived visual realism. This enables more accurate, relightable, and photorealistic renderings from reconstructed scenes, significantly improving the realism and efficiency of asset creation workflows in content production pipelines.

Paper Structure

This paper contains 30 sections, 9 equations, 12 figures, 4 tables.

Figures (12)

  • Figure 1: MatSpray Overview. We utilize 2D material world knowledge from 2D diffusion models to reconstruct 3D relightable objects. Given multi-view images of a target object, we first generate per-view PBR material predictions (base color, roughness, metallic) using any 2D diffusion-based material model. These 2D estimates are then integrated into a 3D Gaussian Splatting reconstruction via Gaussian ray tracing. Finally, a neural refinement stage applies a softmax-based restriction to enforce multi-view consistency and enhance the physical accuracy of the materials. The resulting 3D assets feature high-quality, fully relightable PBR materials under novel illumination. Project page: https://matspray.jdihlmann.com/
  • Figure 2: Pipeline. From multi-view images, a diffusion predictor yields per-view material maps. We reconstruct the object's geometry using 3D Gaussian Splatting. We then project the 2D materials to 3D via ray tracing and refine the per-Gaussian materials with our Neural Merger, whose softmax output layer chooses between the projected values. The resulting material maps are supervised with the predicted 2D material maps. Additionally, using deferred shading, we supervise with a PBR-based photometric rendering loss against the multi-view ground-truth images of the object.
  • Figure 3: Relighting Comparison between our method, an extended version of R3DGS [gao:24], and IRGS [gu:25]. The objects are all relit under the same environment maps. In IRGS, reconstructed scene geometry might partially occlude the environment map.
  • Figure 4: Material Maps produced by our method compared to extended R3DGS [gao:24], which can also predict metallic material maps, IRGS [gu:25], and the DiffusionRenderer material output produced on the test images that are not used for training. We show four images each: base color (top left), roughness (top right), metallic (bottom left), and normals (bottom right).
  • Figure 5: Real-World Comparison of our method, extended R3DGS [gao:24], and IRGS [gu:25]. The ground-truth images show the object masked (top) and unmasked (bottom) to give a better understanding of the object and the surrounding lighting.
  • ...and 7 more figures
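
The softmax-based restriction mentioned in the captions of Figures 1 and 2 can be illustrated with a small NumPy sketch. It is not the authors' implementation: the function names `softmax` and `fuse_materials`, the array shapes, and the use of one weight per view (rather than per channel) are assumptions made for illustration, and the logits, which in the paper would come from the Neural Merger, are plain arrays here. The point it shows is that a softmax over the per-view projected values yields a convex combination, so the fused material of each Gaussian can never leave the range spanned by the 2D predictions.

```python
import numpy as np


def softmax(logits, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = logits - np.max(logits, axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / np.sum(exp, axis=axis, keepdims=True)


def fuse_materials(projected, logits):
    """Fuse per-view material values into one value per Gaussian.

    projected: (G, V, C) array, material values projected onto G Gaussians
               from V views with C channels (hypothetical layout).
    logits:    (G, V) array of unnormalized per-view scores (hypothetical;
               stands in for the output of a learned merger network).
    Returns a (G, C) array of convex combinations of the projected values.
    """
    weights = softmax(logits, axis=1)                    # (G, V), rows sum to 1
    return np.einsum("gv,gvc->gc", weights, projected)   # weighted sum over views


# Toy example: 1 Gaussian, 3 views, 2 channels (e.g. roughness, metallic).
projected = np.array([[[0.2, 0.9], [0.4, 0.7], [0.3, 0.8]]])

# Equal logits give a plain average over the views.
fused_uniform = fuse_materials(projected, np.zeros((1, 3)))

# A dominant logit effectively selects a single view's prediction.
fused_select = fuse_materials(projected, np.array([[0.0, 50.0, 0.0]]))
```

Because the weights are non-negative and sum to one, the output stays within the per-channel minimum and maximum of the projected values, which is one way to read "choosing between the projected values" in the Figure 2 caption.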