Table of Contents
Fetching ...

DreamPolish: Domain Score Distillation With Progressive Geometry Generation

Yean Cheng, Ziqi Cai, Ming Ding, Wendi Zheng, Shiyu Huang, Yuxiao Dong, Jie Tang, Boxin Shi

TL;DR

This work introduces DreamPolish, a text-to-3D generation model that excels in producing refined geometry and high-quality textures and proposes to add a surface polishing stage with only a few training steps, which can effectively refine the artifacts attributed to limited guidance from previous stages and produce 3D objects with more desirable geometry.

Abstract

We introduce DreamPolish, a text-to-3D generation model that excels in producing refined geometry and high-quality textures. In the geometry construction phase, our approach leverages multiple neural representations to enhance the stability of the synthesis process. Instead of relying solely on a view-conditioned diffusion prior in the novel sampled views, which often leads to undesired artifacts in the geometric surface, we incorporate an additional normal estimator to polish the geometry details, conditioned on viewpoints with varying field-of-views. We propose to add a surface polishing stage with only a few training steps, which can effectively refine the artifacts attributed to limited guidance from previous stages and produce 3D objects with more desirable geometry. The key topic of texture generation using pretrained text-to-image models is to find a suitable domain in the vast latent distribution of these models that contains photorealistic and consistent renderings. In the texture generation phase, we introduce a novel score distillation objective, namely domain score distillation (DSD), to guide neural representations toward such a domain. We draw inspiration from the classifier-free guidance (CFG) in textconditioned image generation tasks and show that CFG and variational distribution guidance represent distinct aspects in gradient guidance and are both imperative domains for the enhancement of texture quality. Extensive experiments show our proposed model can produce 3D assets with polished surfaces and photorealistic textures, outperforming existing state-of-the-art methods.

DreamPolish: Domain Score Distillation With Progressive Geometry Generation

TL;DR

This work introduces DreamPolish, a text-to-3D generation model that excels in producing refined geometry and high-quality textures and proposes to add a surface polishing stage with only a few training steps, which can effectively refine the artifacts attributed to limited guidance from previous stages and produce 3D objects with more desirable geometry.

Abstract

We introduce DreamPolish, a text-to-3D generation model that excels in producing refined geometry and high-quality textures. In the geometry construction phase, our approach leverages multiple neural representations to enhance the stability of the synthesis process. Instead of relying solely on a view-conditioned diffusion prior in the novel sampled views, which often leads to undesired artifacts in the geometric surface, we incorporate an additional normal estimator to polish the geometry details, conditioned on viewpoints with varying field-of-views. We propose to add a surface polishing stage with only a few training steps, which can effectively refine the artifacts attributed to limited guidance from previous stages and produce 3D objects with more desirable geometry. The key topic of texture generation using pretrained text-to-image models is to find a suitable domain in the vast latent distribution of these models that contains photorealistic and consistent renderings. In the texture generation phase, we introduce a novel score distillation objective, namely domain score distillation (DSD), to guide neural representations toward such a domain. We draw inspiration from the classifier-free guidance (CFG) in textconditioned image generation tasks and show that CFG and variational distribution guidance represent distinct aspects in gradient guidance and are both imperative domains for the enhancement of texture quality. Extensive experiments show our proposed model can produce 3D assets with polished surfaces and photorealistic textures, outperforming existing state-of-the-art methods.

Paper Structure

This paper contains 22 sections, 11 equations, 7 figures, 1 table.

Figures (7)

  • Figure 1: DreamPolish excels in producing 3D models with polished geometry and photorealistic texture. Please refer to the supplemental material for more results and videos.
  • Figure 2: Overview of DreamPolish. Given a text prompt and its corresponding generated image shown in the top left as input, DreamPolish first progressively constructs a fine-grained 3D geometry with coherent and smoothed surface. Then, DreamPolish leverages DSD as the score distillation objective to guide the representation towards a domain with both consistent and photorealistic texture.
  • Figure 3: Illustration on different score distillation strategies. (a): Vanilla SDS poole2023dreamfusion only has guidance direction on zero-mean noise; (b): VSD wang2023prolificdreamer and BSD sun2023dreamcraft3d utilize a variational domain to improve texture quality; (c): our proposed DSD provides guidance directions toward unconditional image domain and variational domain, further improving the stability and photorealism of rendered texture.
  • Figure 4: Qualitative comparisons with baseline methods. Our method produces 3D objects with high-quality geometry and photorealistic textures. Please refer to the supplementary for more results.
  • Figure 5: User study results.
  • ...and 2 more figures