Material Anything: Generating Materials for Any 3D Object via Diffusion

Xin Huang, Tengfei Wang, Ziwei Liu, Qing Wang

TL;DR

This work presents Material Anything, a fully-automated, unified diffusion framework designed to generate physically-based materials for 3D objects, which leverages a pre-trained image diffusion model enhanced with a triple-head architecture and rendering loss to improve stability and material quality.

Abstract

We present Material Anything, a fully-automated, unified diffusion framework designed to generate physically-based materials for 3D objects. Unlike existing methods that rely on complex pipelines or case-specific optimizations, Material Anything offers a robust, end-to-end solution adaptable to objects under diverse lighting conditions. Our approach leverages a pre-trained image diffusion model, enhanced with a triple-head architecture and rendering loss to improve stability and material quality. Additionally, we introduce confidence masks as a dynamic switcher within the diffusion model, enabling it to effectively handle both textured and texture-less objects across varying lighting conditions. By employing a progressive material generation strategy guided by these confidence masks, along with a UV-space material refiner, our method ensures consistent, UV-ready material outputs. Extensive experiments demonstrate our approach outperforms existing methods across a wide range of object categories and lighting conditions.

Paper Structure

This paper contains 21 sections, 3 equations, 19 figures, 3 tables.

Figures (19)

  • Figure 1: Overview of Material Anything. For texture-less objects, we first generate coarse textures using image diffusion models, similar to the texture generation method of [chen2023text2tex]. Objects with pre-existing textures are processed directly. Next, a material estimator progressively estimates materials for each view from a rendered image, a normal map, and a confidence mask. The confidence mask serves as additional guidance for illuminance uncertainty, addressing lighting variations in the input image and enhancing consistency across the generated multi-view materials. These materials are then unwrapped into UV space and refined by a material refiner. The final material maps are integrated with the mesh, making the object ready for downstream applications.
  • Figure 2: Architectural design of the material estimator and refiner. Both employ a triple-head U-Net, generating albedo, roughness-metallic, and bump maps via separate branches.
  • Figure 3: Progressive material generation process for a texture-less object. "Project" denotes projecting known regions for the latent initialization of the next view. "SD" denotes the pre-trained Stable Diffusion model [rombach2022high] with a depth ControlNet [zhang2023adding].
  • Figure 4: Comparisons with texture generation methods. These methods directly paint texture-less objects using image diffusion models but fail to generate the corresponding material properties.
  • Figure 5: Comparisons with optimization methods. NvDiffRec [munkberg2022extracting] estimates materials, taking the model textured by SyncMVD [liu2023text] as input. The materials shown are albedo (top left), roughness (top right), metallic (bottom left), and bump (bottom right).
  • ...and 14 more figures
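
The view-by-view process described in the captions of Figures 1 and 3 can be illustrated with a toy sketch. This is not the authors' implementation. It assumes a 1D list of texels in place of a UV map, treats the confidence mask as a simple flag for texels already covered by an earlier view, and replaces the diffusion-based estimator with a hypothetical stub (`stub_estimator`). All names here (`Material`, `stub_estimator`, `progressive_materials`) are invented for illustration, and the latent initialization and illuminance-related role of the mask are not modeled.

```python
from typing import Callable, Dict, List, Optional, Sequence

# Hypothetical per-texel record; the paper's maps are images, not scalars.
Material = Dict[str, float]


def stub_estimator(rendered_value: float) -> Material:
    """Placeholder for a learned material estimator (an assumption, not the paper's model)."""
    v = min(max(rendered_value, 0.0), 1.0)  # clamp the rendered intensity to [0, 1]
    return {"albedo": v, "roughness": 1.0 - v, "metallic": 0.0}


def progressive_materials(
    num_texels: int,
    views: Sequence[Sequence[int]],
    rendered: Sequence[float],
    estimator: Callable[[float], Material] = stub_estimator,
) -> List[Optional[Material]]:
    """Fill a toy UV buffer one view at a time, reusing texels that earlier views covered."""
    uv: List[Optional[Material]] = [None] * num_texels
    for visible in views:
        # Toy confidence mask: 1 where an earlier view already wrote a material, else 0.
        confidence = {t: 1 if uv[t] is not None else 0 for t in visible}
        for t in visible:
            if confidence[t] == 1:
                continue  # known region projected from earlier views: keep it unchanged
            uv[t] = estimator(rendered[t])  # unknown region: estimate a new material
    return uv


if __name__ == "__main__":
    # Two overlapping views over four texels; texel 2 is visible in both.
    result = progressive_materials(4, [[0, 1, 2], [2, 3]], [0.2, 0.4, 0.6, 0.8])
    for index, material in enumerate(result):
        print(index, material)
```

In this sketch, texel 2 is estimated once by the first view and skipped by the second, which is the sense in which known regions stay consistent across views. A real pipeline would operate on rendered images and normal maps per view and would follow this step with unwrapping and refinement in UV space.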