MCMat: Multiview-Consistent and Physically Accurate PBR Material Generation

Shenhao Zhu; Lingteng Qiu; Xiaodong Gu; Zhengyi Zhao; Chao Xu; Yuxiao He; Zhe Li; Xiaoguang Han; Yao Yao; Xun Cao; Siyu Zhu; Weihao Yuan; Zilong Dong; Hao Zhu

TL;DR

This work introduces MCMat, a two-stage framework for multi-view-consistent and physically accurate PBR material generation for 3D models conditioned on text or reference images. The generation stage employs MG-DiT, a multi-view diffusion transformer with geometric conditioning from surface normals and a reference-based block to ensure cross-view consistency and fidelity to references, guided by a PBR-based diffusion loss. The refinement stage uses MR-DiT to convert incomplete, low-resolution multi-view outputs into high-quality 2K UV-space textures through inpainting and detail enhancement, leveraging a coarse texture map and normal cues. Experiments on a large 3D model dataset demonstrate state-of-the-art performance in text-to-PBR material generation and relighting, with significant improvements in realism, fidelity, and generalization under varying lighting conditions.

Abstract

Existing 2D methods use UNet-based diffusion models to generate multi-view physically-based rendering (PBR) maps but suffer from multi-view inconsistency, while some 3D methods generate UV maps directly and generalize poorly because 3D training data is limited. To address these problems, we propose a two-stage approach consisting of multi-view generation and UV material refinement. In the generation stage, we adopt a Diffusion Transformer (DiT) model to generate PBR materials. Both the specially designed multi-branch DiT and the reference-based DiT blocks use a global attention mechanism to promote feature interaction and fusion across views, thereby improving multi-view consistency. In addition, we adopt a PBR-based diffusion loss to ensure that the generated materials are consistent with physical principles. In the refinement stage, we propose a material-refined DiT that inpaints empty areas and enhances details in UV space. In addition to the normal condition, this refinement stage takes the material map from the generation stage as a further condition, which reduces the learning difficulty and improves generalization. Extensive experiments show that our method achieves state-of-the-art performance in texturing 3D objects with PBR materials and provides significant advantages for graphics relighting applications. Project Page: https://lingtengqiu.github.io/2024/MCMat/
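To make the idea of global attention across views concrete, the sketch below flattens the tokens of all views into one sequence so that every token can attend to every other token, regardless of which view it came from. This is a minimal single-head illustration in NumPy, not the authors' MG-DiT: the function name `global_multiview_attention`, the tensor shapes, and the random projection weights are all assumptions made for illustration, and normal conditioning, the reference-based block, and the PBR-based loss are omitted.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def global_multiview_attention(tokens, w_q, w_k, w_v):
    """Single-head self-attention over the tokens of ALL views jointly.

    tokens: array of shape (num_views, num_tokens, dim).
    w_q, w_k, w_v: hypothetical projection matrices of shape (dim, dim).
    Returns the attended tokens, shape (num_views, num_tokens, dim),
    and the attention weights, shape (V * N, V * N).
    """
    num_views, num_tokens, dim = tokens.shape
    # Merge the view axis into the sequence axis: this is what makes the
    # attention "global" rather than per-view.
    flat = tokens.reshape(num_views * num_tokens, dim)
    q, k, v = flat @ w_q, flat @ w_k, flat @ w_v
    weights = softmax(q @ k.T / np.sqrt(dim), axis=-1)
    out = weights @ v
    return out.reshape(num_views, num_tokens, dim), weights


# Toy setup (assumed sizes): 4 views, 16 tokens per view, 8 channels.
rng = np.random.default_rng(0)
V, N, D = 4, 16, 8
tokens = rng.standard_normal((V, N, D))
w_q, w_k, w_v = (rng.standard_normal((D, D)) / np.sqrt(D) for _ in range(3))

out, weights = global_multiview_attention(tokens, w_q, w_k, w_v)

# Perturb only the last view and recompute: the first view's output changes
# too, because its tokens attend to tokens from every view.
perturbed = tokens.copy()
perturbed[3] += 1.0
out_perturbed, _ = global_multiview_attention(perturbed, w_q, w_k, w_v)
```

Because the attention matrix spans all `V * N` tokens, a change in one view propagates to the features of the others, which is the mechanism the abstract credits for improved cross-view consistency. A per-view attention would instead use a block-diagonal weight matrix and leave the other views unaffected.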
