DreamComposer: Controllable 3D Object Generation via Multi-View Conditions

Yunhan Yang; Yukun Huang; Xiaoyang Wu; Yuan-Chen Guo; Song-Hai Zhang; Hengshuang Zhao; Tong He; Xihui Liu

TL;DR

Experiments show that DreamComposer is compatible with state-of-the-art diffusion models for zero-shot novel view synthesis, further enhancing them to generate high-fidelity novel view images with multi-view conditions, ready for controllable 3D object reconstruction and various other applications.

Abstract

Utilizing large-scale pre-trained 2D generative models, recent works are capable of generating high-quality novel views from a single in-the-wild image. However, due to the lack of information from multiple views, these works encounter difficulties in generating controllable novel views. In this paper, we present DreamComposer, a flexible and scalable framework that can enhance existing view-aware diffusion models by injecting multi-view conditions. Specifically, DreamComposer first uses a view-aware 3D lifting module to obtain 3D representations of an object from multiple views. Then, it renders the latent features of the target view from the 3D representations with the multi-view feature fusion module. Finally, the target-view features extracted from the multi-view inputs are injected into a pre-trained diffusion model. Experiments show that DreamComposer is compatible with state-of-the-art diffusion models for zero-shot novel view synthesis, further enhancing them to generate high-fidelity novel view images with multi-view conditions, ready for controllable 3D object reconstruction and various other applications.

Paper Structure (23 sections, 3 equations, 16 figures, 9 tables)

Introduction
Related Work
Method
Target-Aware 3D Lifting
Multi-View Feature Fusion
Target-View Feature Injection
Training and Inference
Experiments
Datasets and Evaluation Metrics
Implementation Details
Multi-view Input
Single-view Input
Applications
Ablation Analysis
Conclusion and Discussions
...and 8 more sections

Figures (16)

Figure 1: DreamComposer is able to generate controllable novel views and 3D objects by injecting multi-view conditions. We incorporate the method into the pipelines of Zero-1-to-3 [liu2023zero1to3] and SyncDreamer (SyncD) [liu2023syncdreamer] to enhance the controllability of those models.
Figure 2: An overview pipeline of DreamComposer. Given multiple input images from different views, DreamComposer extracts their 2D latent features and uses a 3D lifting module to produce tri-plane 3D representations. Then, the multi-view condition rendered from 3D representations is injected into the pre-trained diffusion model to provide target-view auxiliary information.
Figure 3: Different numbers of ground-truth inputs. Our model is capable of handling a variety of ground-truth input quantities.
Figure 4: Qualitative comparisons with Zero-1-to-3 [liu2023zero1to3] in controllable novel view synthesis. DC-Zero-1-to-3 effectively generates more controllable images from novel viewpoints by utilizing conditions from multi-view images.
Figure 5: Qualitative comparison with SyncDreamer (SyncD) [liu2023syncdreamer] in controllable novel view synthesis and 3D reconstruction. The image in $\square$ is the main input, and the other image in $\square$ is the conditional input generated from Zero-1-to-3 [liu2023zero1to3]. With more information in multi-view images, DC-SyncDreamer is able to generate more accurate back textures and more controllable 3D shapes.
...and 11 more figures
