ExScene: Free-View 3D Scene Reconstruction with Gaussian Splatting from a Single Image

Tianyi Gong; Boyan Li; Yifei Zhong; Fangxin Wang

TL;DR

ExScene designs a novel multimodal diffusion model to generate a high-fidelity, globally consistent panoramic image, develops a panoramic depth estimation approach to recover geometric information from that panorama, and combines the geometric information with the panoramic image to train an initial 3D Gaussian Splatting model.

Abstract

The increasing demand for augmented and virtual reality applications has highlighted the importance of crafting immersive 3D scenes from a single-view image. However, because a single view provides only partial priors, existing methods are often limited to reconstructing low-consistency 3D scenes with narrow fields of view. These limitations make them less capable of generalizing to immersive scenes. To address this problem, we propose ExScene, a two-stage pipeline that reconstructs an immersive 3D scene from any given single-view image. ExScene designs a novel multimodal diffusion model to generate a high-fidelity and globally consistent panoramic image. We then develop a panoramic depth estimation approach to calculate geometric information from the panorama, and we combine this geometric information with the high-fidelity panoramic image to train an initial 3D Gaussian Splatting (3DGS) model. Following this, we introduce a GS refinement technique with 2D stable video diffusion priors. We add camera trajectory consistency and color-geometric priors into the denoising process of the diffusion model to improve color and spatial consistency across image sequences. These refined sequences are then used to fine-tune the initial 3DGS model, leading to better reconstruction quality. Experimental results demonstrate that ExScene achieves consistent and immersive scene reconstruction using only single-view input, significantly surpassing state-of-the-art baselines.
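To make the step of combining panoramic depth with the panoramic image more concrete, the following is a minimal sketch of one common way to lift an equirectangular panorama and its depth map into a colored point cloud, which could then seed the Gaussian centers of a 3DGS model. It is not the authors' implementation. The function names (`panorama_pixel_to_ray`, `unproject_panorama`), the axis convention (y up, z forward at the image center), and the reading of depth as radial distance along each viewing ray are all assumptions made for illustration.

```python
import math


def panorama_pixel_to_ray(u, v, width, height):
    """Unit viewing direction for pixel (u, v) of an equirectangular panorama.

    Assumed convention (hypothetical, not from the paper): longitude spans
    [-pi, pi) left to right, latitude spans +pi/2 (top) to -pi/2 (bottom),
    y points up and z points toward the image center.
    """
    lon = ((u + 0.5) / width) * 2.0 * math.pi - math.pi
    lat = math.pi / 2.0 - ((v + 0.5) / height) * math.pi
    return (
        math.cos(lat) * math.sin(lon),
        math.sin(lat),
        math.cos(lat) * math.cos(lon),
    )


def unproject_panorama(depth, colors):
    """Lift a panorama to a list of ((x, y, z), rgb) points.

    depth  : list of rows of radial distances (assumed to be measured along the ray)
    colors : list of rows of RGB tuples with the same shape as depth
    Pixels with non-positive or non-finite depth are skipped.
    """
    height = len(depth)
    width = len(depth[0])
    points = []
    for v in range(height):
        for u in range(width):
            d = depth[v][u]
            if not math.isfinite(d) or d <= 0.0:
                continue
            dx, dy, dz = panorama_pixel_to_ray(u, v, width, height)
            points.append(((d * dx, d * dy, d * dz), colors[v][u]))
    return points


# Tiny usage example: a 2x1 panorama with constant depth 2.0.
example_points = unproject_panorama(
    [[2.0, 2.0]],
    [[(255, 0, 0), (0, 255, 0)]],
)
```

In a sketch like this, each returned point would supply an initial position and color for one Gaussian. Scale, opacity, and rotation would still need to be initialized and optimized separately.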
