MicroDreamer: Efficient 3D Generation in $\sim$20 Seconds by Score-based Iterative Reconstruction

Luxi Chen, Zhengyi Wang, Zihan Zhou, Tingting Gao, Hang Su, Jun Zhu, Chongxuan Li

TL;DR

This paper introduces score-based iterative reconstruction (SIR), an efficient and general algorithm mimicking a differentiable 3D reconstruction process to reduce the NFEs and enable optimization in pixel space.

Abstract

Optimization-based approaches, such as score distillation sampling (SDS), show promise in zero-shot 3D generation but suffer from low efficiency, primarily because of the high number of function evaluations (NFEs) required for each sample and because optimization is confined to latent space. This paper introduces score-based iterative reconstruction (SIR), an efficient and general algorithm that mimics a differentiable 3D reconstruction process to reduce the NFEs and enable optimization in pixel space. Given a single set of images sampled from a multi-view score-based diffusion model, SIR repeatedly optimizes 3D parameters, unlike the single-step optimization in SDS. With other improvements in training, we present an efficient approach called MicroDreamer that applies generally to various 3D representations and 3D generation tasks. In particular, MicroDreamer is 5-20 times faster than SDS in generating neural radiance fields while retaining comparable performance, and it takes about 20 seconds to create meshes from 3D Gaussian splatting on a single A100 GPU, halving the time of the fastest optimization-based baseline, DreamGaussian, with a performance gain that is significant relative to the measurement standard deviation. Our code is available at https://github.com/ML-GSAI/MicroDreamer.
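
The contrast between repeated optimization on one sampled image set and single-step optimization can be illustrated with a toy sketch. Everything here is an assumption made for illustration and is not the MicroDreamer code: the linear `render`, the `sample_images` stand-in for a multi-view diffusion sampler, the plain pixel-space MSE loss, and the convention that one sampler call counts as one NFE. The only point is how reusing each sampled image set for several reconstruction steps lowers the number of model calls for the same number of parameter updates.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy setup: each "view" is a linear map from 3D parameters to pixels.
N_VIEWS, N_PIX, N_PARAMS = 4, 16, 8
VIEWS = rng.normal(size=(N_VIEWS, N_PIX, N_PARAMS))
THETA_TRUE = rng.normal(size=N_PARAMS)
TARGETS = VIEWS @ THETA_TRUE  # shape (N_VIEWS, N_PIX)

# Step size chosen from the largest curvature so gradient descent is stable.
HESSIAN = np.einsum("vpk,vpj->kj", VIEWS, VIEWS) / N_VIEWS
LR = 1.0 / np.linalg.eigvalsh(HESSIAN)[-1]


def render(theta):
    """Toy differentiable renderer: one image per view."""
    return VIEWS @ theta


def sample_images(rendered, strength=0.5):
    """Stand-in for a diffusion sampler: moves the renders toward the targets."""
    return rendered + strength * (TARGETS - rendered)


def optimize(rounds, recon_steps):
    """Sample once per round, then take `recon_steps` pixel-space gradient steps."""
    theta = np.zeros(N_PARAMS)
    nfe = 0  # counts calls to the stand-in sampler
    for _ in range(rounds):
        images = sample_images(render(theta))
        nfe += 1
        for _ in range(recon_steps):
            residual = render(theta) - images  # pixel-space error
            grad = np.einsum("vpk,vp->k", VIEWS, residual) / N_VIEWS
            theta = theta - LR * grad
    return theta, nfe


# Same number of parameter updates (200), very different numbers of sampler calls.
theta_iter, nfe_iter = optimize(rounds=10, recon_steps=20)    # reuse each image set
theta_single, nfe_single = optimize(rounds=200, recon_steps=1)  # one step per sample
print("sampler calls:", nfe_iter, "vs", nfe_single)
```

In this toy setting both schedules move the parameters toward `THETA_TRUE`; the sketch says nothing about the relative sample quality of the real methods, which the paper evaluates separately.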

Paper Structure

This paper contains 20 sections, 12 equations, 13 figures, 3 tables, and 1 algorithm.

Figures (13)

  • Figure 1: MicroDreamer surpasses the fastest optimization-based baseline, DreamGaussian (Tang et al., 2023), in terms of both efficiency and sample quality. The optimization-based methods are highlighted in red. See Tab. \ref{tab:quan-comp} for a comprehensive comparison with more baselines.
  • Figure 2: MicroDreamer can generate a high-quality mesh (as illustrated above) in about 20 seconds on a single A100, built on a multi-view diffusion model without additional 3D data. See supplementary materials for 3D visualization.
  • Figure 3: Overview of SIR. SIR is an optimization-based 3D generation method that combines the strengths of reconstruction and iterative optimization. SIR reuses the samples from the diffusion model multiple times through reconstruction, which reduces the total NFEs, enables optimization in pixel space, and improves efficiency.
  • Figure 4: Time proportion in SDS optimization. We record the proportion of time spent in each component of SDS on 3DGS. The bottleneck lies in the large number of NFEs and in updating in latent space.
  • Figure 5: The hybrid forward process is more efficient than DDIM inversion and generates better samples than noise-adding. We present the final results and sampling time over 20 iterations of SIR for the three forward processes. Notably, the noise-adding process may generate artifacts that contain unexpected elements compared to the input.
  • ...and 8 more figures