DGTR: Distributed Gaussian Turbo-Reconstruction for Sparse-View Vast Scenes

Hao Li, Yuanyuan Gao, Haosong Peng, Chenming Wu, Weicai Ye, Yufeng Zhan, Chen Zhao, Dingwen Zhang, Jingdong Wang, Junwei Han

TL;DR

DGTR is a novel distributed framework for efficient Gaussian reconstruction of sparse-view vast scenes. It delivers scene reconstruction and novel-view synthesis in significantly reduced training time, outperforming existing approaches in both speed and scalability.

Abstract

Novel-view synthesis (NVS) approaches play a critical role in vast scene reconstruction. However, these methods rely heavily on dense image inputs and prolonged training times, making them unsuitable where computational resources are limited. Additionally, few-shot methods often struggle with poor reconstruction quality in vast environments. This paper presents DGTR, a novel distributed framework for efficient Gaussian reconstruction for sparse-view vast scenes. Our approach divides the scene into regions, processed independently by drones with sparse image inputs. Using a feed-forward Gaussian model, we predict high-quality Gaussian primitives, followed by a global alignment algorithm to ensure geometric consistency. Synthetic views and depth priors are incorporated to further enhance training, while a distillation-based model aggregation mechanism enables efficient reconstruction. Our method achieves high-quality large-scale scene reconstruction and novel-view synthesis in significantly reduced training times, outperforming existing approaches in both speed and scalability. We demonstrate the effectiveness of our framework on vast aerial scenes, achieving high-quality results within minutes. Code will be released at https://3d-aigc.github.io/DGTR.
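
The abstract states that the scene is divided into regions that drones process independently, but it does not give the partitioning rule. As a minimal sketch, assuming a uniform grid over ground-plane camera positions, the following shows one way such non-overlapping regions could be formed. The function name `partition_cameras` and the grid layout are hypothetical illustrations, not the authors' method.

```python
from collections import defaultdict


def partition_cameras(positions, rows, cols):
    """Assign each (x, y) camera position to one cell of a rows x cols grid.

    Hypothetical helper for illustration only: the uniform grid is an
    assumption, not the partitioning scheme used by DGTR.
    Returns a dict mapping region id -> list of camera indices.
    """
    if not positions:
        return {}
    xs = [p[0] for p in positions]
    ys = [p[1] for p in positions]
    x_min, y_min = min(xs), min(ys)
    # Fall back to a unit span when all cameras share a coordinate.
    x_span = (max(xs) - x_min) or 1.0
    y_span = (max(ys) - y_min) or 1.0

    regions = defaultdict(list)
    for idx, (x, y) in enumerate(positions):
        # Clamp so cameras on the far boundary land in the last cell.
        col = min(int((x - x_min) / x_span * cols), cols - 1)
        row = min(int((y - y_min) / y_span * rows), rows - 1)
        regions[row * cols + col].append(idx)
    return dict(regions)


if __name__ == "__main__":
    cams = [(0, 0), (1, 0), (9, 0), (10, 10), (0, 10)]
    for region_id, cam_ids in sorted(partition_cameras(cams, 2, 2).items()):
        print(f"region {region_id}: cameras {cam_ids}")
```

Because every camera index lands in exactly one cell, each region can be handed to a separate device without sharing images, which matches the independent per-region processing the abstract describes.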

Paper Structure

This paper contains 25 sections, 7 equations, 9 figures, 3 tables.

Figures (9)

  • Figure 1: Our proposed DGTR can rapidly reconstruct sparse-view vast scenes in a distributed manner. Compared with the standard central 3DGS training method, we achieve better visual appearance and geometry accuracy at a faster speed.
  • Figure 2: DGTR (Ours) Overview: Given $M$ individual devices (drones), we aim to perform fast sparse-view vast scene reconstruction through multi-device collaboration. The whole pipeline can be divided into three steps: 1) each device explores a non-overlapping region and conducts Gaussian initialization using an off-the-shelf feed-forward Gaussian method and a global alignment strategy; 2) each device performs sparse-view scene reconstruction using the initialized Gaussians; 3) each device uploads its trained Gaussian model to the central server, which performs model aggregation via distillation.
  • Figure 3: Overview of our Model Aggregation Algorithm.
  • Figure 4: Partitions of different scenes. '$\Box$' denotes the partition area for each drone. '$\cdot$' denotes camera positions. The background shows the sparse points produced by global COLMAP.
  • Figure 5: Training curves for our method and distributed-3DGS on three devices (i.e., #1, #2, #8).
  • ...and 4 more figures