MatMart: Material Reconstruction of 3D Objects via Diffusion

Xiuchao Wu, Pengfei Zhu, Jiangjing Lyu, Xinguo Liu, Jie Guo, Yanwen Guo, Weiwei Xu, Chengfei Lyu

TL;DR

MatMart tackles the challenge of reconstructing physically-based rendering materials for 3D objects from RGB images by leveraging diffusion models in a two-stage pipeline. It first performs progressive material estimation with UV-space baking, then uses adaptive view selection and prior-guided generation to fill occluded regions, all within a single end-to-end trainable diffusion model. A core contribution is the view-material cross-attention (VMCA), which enables multi-view consistency under progressive inference and reduces memory demands. Comprehensive experiments on Objaverse show superior material prediction and generation quality, with ablations confirming the effectiveness of VMCA, material priors, and texture baking. The approach offers a scalable, stable alternative to multi-model pipelines, enabling high-resolution, view-flexible material reconstruction for practical applications.

Abstract

Applying diffusion models to physically-based material estimation and generation has recently gained prominence. In this paper, we propose MatMart, a novel material reconstruction framework for 3D objects, offering the following advantages. First, MatMart adopts a two-stage reconstruction, starting with accurate material prediction from inputs and followed by prior-guided material generation for unobserved views, yielding high-fidelity results. Second, by utilizing progressive inference alongside the proposed view-material cross-attention (VMCA), MatMart enables reconstruction from an arbitrary number of input images, demonstrating strong scalability and flexibility. Finally, MatMart achieves both material prediction and generation capabilities through end-to-end optimization of a single diffusion model, without relying on additional pre-trained models, thereby exhibiting enhanced stability across various types of objects. Extensive experiments demonstrate that MatMart achieves superior performance in material reconstruction compared to existing methods.

Paper Structure

This paper contains 25 sections, 2 equations, 10 figures, 2 tables.

Figures (10)

  • Figure 1: Reconstruction results from single-view input. Our method reconstructs high-quality materials across various types of objects.
  • Figure 2: Method overview. Our framework, MatMart, divides the material reconstruction task into two stages. In the first stage, progressive material prediction is performed on the input images, and the predicted results are baked into the UV space. In the second stage, prior-guided material generation and texture baking are alternately conducted for unobserved and occluded regions. Both prediction and generation tasks are unified within a single diffusion model and can be accomplished through end-to-end optimization.
  • Figure 3: The effect of VMCA on predicted results. VMCA improves the consistency for progressive material estimation.
  • Figure 4: Generation with high resolution improves texture-geometry alignment. The top right displays the blended results of normal and generated albedo to visualize the alignment.
  • Figure 5: Qualitative comparison with single-view input. Our method recovers more accurate materials across various types of objects, thereby achieving rendering results that are more consistent with the ground truth. From left to right: albedo, roughness, and metallic.
  • ...and 5 more figures