
Boosting 3D Object Generation through PBR Materials

Yitong Wang, Xudong Xu, Li Ma, Haoran Wang, Bo Dai

TL;DR

This work proposes a novel approach to boost the quality of generated 3D objects from the perspective of Physics-Based Rendering (PBR) materials, leveraging Stable Diffusion fine-tuned on synthetic data to extract albedo and bump maps and adopting a semi-automatic process for roughness and metalness maps that leaves room for interactive adjustment.

Abstract

Automatic 3D content creation has gained increasing attention recently due to its potential in applications such as video games, film, and AR/VR. Recent advancements in diffusion models and multimodal models have notably improved the quality and efficiency of 3D object generation given a single RGB image. However, 3D objects generated even by state-of-the-art methods are still unsatisfactory compared to human-created assets. Because they consider only textures rather than materials, these methods struggle with photo-realistic rendering, relighting, and flexible appearance editing. They also suffer from severe misalignment between geometry and high-frequency texture details. In this work, we propose a novel approach to boost the quality of generated 3D objects from the perspective of Physics-Based Rendering (PBR) materials. By analyzing the components of PBR materials, we choose to consider albedo, roughness, metalness, and bump maps. For albedo and bump maps, we leverage Stable Diffusion fine-tuned on synthetic data to extract these values, and we use these fine-tuned models in novel ways to obtain 3D-consistent albedo UV and bump UV for generated objects. For roughness and metalness maps, we adopt a semi-automatic process that leaves room for interactive adjustment, which we believe is more practical. Extensive experiments demonstrate that our model is generally beneficial for various state-of-the-art generation methods, significantly boosting the quality and realism of their generated 3D objects, with natural relighting effects and substantially improved geometry.
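
To illustrate why the four maps named in the abstract matter for relighting, the following is a minimal sketch of how albedo, roughness, metalness, and a bump-perturbed normal could enter the shading of a single surface point. It assumes a standard Cook-Torrance metallic-roughness model (GGX distribution, Schlick Fresnel, Smith-Schlick geometry term); this is a common textbook choice and not necessarily the renderer used in the paper. The function name `shade` and all parameter names are hypothetical.

```python
import math


def normalize(v):
    """Return v scaled to unit length."""
    n = math.sqrt(sum(c * c for c in v))
    return tuple(c / n for c in v)


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def shade(albedo, roughness, metalness, normal, light_dir, view_dir,
          light_rgb=(1.0, 1.0, 1.0)):
    """Shade one surface point lit by one directional light.

    Assumed model (not taken from the paper): Cook-Torrance specular
    plus Lambertian diffuse, with the metallic-roughness convention.
    All direction vectors point away from the surface.
    """
    n, l, v = normalize(normal), normalize(light_dir), normalize(view_dir)
    n_l = max(dot(n, l), 0.0)
    if n_l == 0.0:
        return (0.0, 0.0, 0.0)  # light is behind the surface
    h = normalize(tuple(a + b for a, b in zip(l, v)))  # half vector
    n_v = max(dot(n, v), 1e-4)
    n_h = max(dot(n, h), 0.0)
    v_h = max(dot(v, h), 0.0)

    # GGX normal distribution; roughness is clamped to avoid a singular lobe.
    alpha = max(roughness, 0.04) ** 2
    d = alpha ** 2 / (math.pi * (n_h ** 2 * (alpha ** 2 - 1.0) + 1.0) ** 2)

    # Smith geometry term with the Schlick-GGX approximation (direct light).
    k = (roughness + 1.0) ** 2 / 8.0
    g = (n_l / (n_l * (1.0 - k) + k)) * (n_v / (n_v * (1.0 - k) + k))
    spec = d * g / (4.0 * n_l * n_v)

    out = []
    for a, c in zip(albedo, light_rgb):
        # Metals reflect with their albedo colour; dielectrics with ~4%.
        f0 = 0.04 * (1.0 - metalness) + a * metalness
        f = f0 + (1.0 - f0) * (1.0 - v_h) ** 5  # Schlick Fresnel
        diffuse = (1.0 - f) * (1.0 - metalness) * a / math.pi
        out.append((diffuse + f * spec) * c * n_l)
    return tuple(out)


if __name__ == "__main__":
    up = (0.0, 0.0, 1.0)
    # Same material, flat normal versus a bump-perturbed normal.
    print(shade((0.8, 0.6, 0.4), 0.5, 0.0, up, (0.3, 0.0, 1.0), up))
    print(shade((0.8, 0.6, 0.4), 0.5, 0.0, (0.2, 0.0, 1.0), (0.3, 0.0, 1.0), up))
```

Changing `light_dir` or `light_rgb` relights the point without touching the material parameters, which is the property that a baked RGB texture lacks.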

Paper Structure

This paper contains 27 sections, 6 equations, 12 figures, 1 table.

Figures (12)

  • Figure 1: Overview of our 3D generation pipeline. Given a single image, we first convert it to an albedo map using our fine-tuned diffusion model. Conditioned on this derived albedo, the base method to be boosted generates multi-view albedo maps and then fuses them into a 3D mesh and an albedo UV. Afterwards, we leverage a 3D semantic mask to obtain complete metalness and roughness UVs, either by querying VLMs or through manual adjustment by 3D artists. Moreover, an iterative normal refinement is employed to improve the original flawed normals, enabling realistic relighting results.
  • Figure 2: Normal boosting for four different methods. Our iterative normal refinement significantly reduces the original geometry flaws and captures more intricate details that align with the corresponding images. Notably, TripoSR inevitably predicts artificial geometry details, while our method avoids this issue.
  • Figure 3: Our refined normal maps lead to improved relighting outcomes under novel lighting environments. (Zoom in for best view)
  • Figure 4: Normal boosting on DreamCraft3D. Our iterative normal refinement also shows its effectiveness on typical 3D objects generated by the prominent method DreamCraft3D.
  • Figure 5: Qualitative comparison of albedo estimation. Regarding albedo estimation from the single image, our fine-tuned diffusion model outperforms two strong baselines on in-the-wild testing cases.
  • ...and 7 more figures