BlockGaussian: Efficient Large-Scale Scene Novel View Synthesis via Adaptive Block-Based Gaussian Splatting

Yongchang Wu, Zipeng Qi, Zhenwei Shi, Zhengxia Zou

TL;DR

BlockGaussian tackles the challenge of efficient, high-fidelity novel view synthesis for city-scale scenes by introducing a content-aware, adaptive partitioning strategy that balances block workloads. It then performs independent block optimization with auxiliary points to address supervision mismatch and applies a pseudo-view geometry constraint to supervise airspace during fusion, enabling seamless block merging. The method achieves state-of-the-art results on multiple large-scale benchmarks, delivering a 5× speedup in optimization and an average PSNR improvement of 1.21 dB, while running on a single 24GB VRAM device. This approach significantly reduces computational demands and enhances rendering quality, paving the way for scalable, interactive large-scale scene reconstruction with Gaussian-based representations.
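To put the reported 1.21 dB figure in perspective, the short sketch below converts a PSNR gain into the equivalent reduction in mean squared error. It uses only the standard PSNR definition; the function names are illustrative, and the conversion applies to a single image at a fixed peak value, not to the paper's per-scene averaging.

```python
import math


def psnr_db(mse: float, peak: float = 1.0) -> float:
    """PSNR in dB for a given mean squared error and peak signal value."""
    return 10.0 * math.log10(peak ** 2 / mse)


def mse_ratio_for_psnr_gain(gain_db: float) -> float:
    """Factor by which the MSE shrinks when PSNR rises by gain_db."""
    return 10.0 ** (-gain_db / 10.0)


if __name__ == "__main__":
    ratio = mse_ratio_for_psnr_gain(1.21)
    print(f"MSE ratio for a 1.21 dB gain: {ratio:.3f}")
```

Under these assumptions, a 1.21 dB gain corresponds to roughly a quarter less squared error per pixel.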

Abstract

The recent advancements in 3D Gaussian Splatting (3DGS) have demonstrated remarkable potential in novel view synthesis tasks. The divide-and-conquer paradigm has enabled large-scale scene reconstruction, but significant challenges remain in scene partitioning, optimization, and merging processes. This paper introduces BlockGaussian, a novel framework incorporating a content-aware scene partition strategy and visibility-aware block optimization to achieve efficient and high-quality large-scale scene reconstruction. Specifically, our approach considers the content-complexity variation across different regions and balances computational load during scene partitioning, enabling efficient scene reconstruction. To tackle the supervision mismatch issue during independent block optimization, we introduce auxiliary points during individual block optimization to align the ground-truth supervision, which enhances the reconstruction quality. Furthermore, we propose a pseudo-view geometry constraint that effectively mitigates rendering degradation caused by airspace floaters during block merging. Extensive experiments on large-scale scenes demonstrate that our approach achieves state-of-the-art performance in both reconstruction efficiency and rendering quality, with a 5x speedup in optimization and an average PSNR improvement of 1.21 dB on multiple benchmarks. Notably, BlockGaussian significantly reduces computational requirements, enabling large-scale scene reconstruction on a single 24GB VRAM device. The project page is available at https://github.com/SunshineWYC/BlockGaussian
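The abstract does not spell out the partitioning algorithm, so the following is only a minimal sketch of the general idea of content-aware, load-balanced partitioning. It assumes the scene content is approximated by sparse points projected onto the ground plane, and it uses a recursive median split with a hypothetical `max_points` budget per block. This is a stand-in technique chosen for illustration, not the authors' method.

```python
import numpy as np


def partition_by_content(points: np.ndarray, max_points: int) -> list[np.ndarray]:
    """Recursively split (N, 2) ground-plane points into blocks.

    Dense regions are subdivided more often than sparse ones, so every
    block ends up with a comparable number of points (a proxy for load).
    """
    if len(points) <= max_points:
        return [points]

    # Split along the axis with the larger spatial extent.
    extent = points.max(axis=0) - points.min(axis=0)
    axis = int(np.argmax(extent))
    median = np.median(points[:, axis])

    left = points[points[:, axis] <= median]
    right = points[points[:, axis] > median]
    if len(right) == 0:
        # Degenerate case (all coordinates equal): stop splitting.
        return [points]

    return (partition_by_content(left, max_points)
            + partition_by_content(right, max_points))
```

Because each split halves the point count instead of the area, a dense cluster yields many small blocks while a sparse region stays in a few large ones.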

Paper Structure

This paper contains 21 sections, 13 equations, 10 figures, 7 tables, 1 algorithm.

Figures (10)

  • Figure 1: BlockGaussian reconstructs city-scale scenes from massive multi-view images and enables high-quality novel view synthesis from arbitrary viewpoints, as illustrated in the surrounding images. Compared to existing methods, our approach reduces reconstruction time from hours to minutes while achieving superior rendering quality in most scenes.
  • Figure 2: Challenges in large-scale novel view synthesis under the divide-and-conquer paradigm. a) Imbalanced reconstruction complexity across blocks: Content density differs significantly between scene regions. Areas with dense content require finer subdivision granularity to ensure reconstruction fidelity, while regions with sparser content benefit from coarser partitioning to improve computational efficiency. b) Supervision mismatch in block-wise optimization: The content of a training view may be split across multiple blocks after scene partitioning. Due to visibility constraints, the full training image is not the ideal supervision signal when optimizing an individual block. c) Quality degradation in fusion results: Floaters in the airspace are a major cause of quality degradation in the fused result. Since each block is optimized individually, these floaters fit the training views well but degrade the quality of synthesized novel views, especially in boundary regions.
  • Figure 3: Overview of our proposed method. We first divide the entire scene and allocate viewpoints with Content-Aware Scene Partition, which jointly considers the complexity of the scene content and the computational load distribution across blocks. Subsequently, we optimize each block independently, either sequentially on a single GPU or in parallel across multiple GPUs. During block optimization, we introduce auxiliary point clouds (aux pts) to address the supervision mismatch issue. The Pseudo-View Geometry Constraint is applied to supervise airspace regions and mitigate floater artifacts. Finally, the optimized results from all blocks are merged into a complete Gaussian representation of the entire scene, enabling interactive novel view synthesis.
  • Figure 4: Illustration of the Pseudo-View Geometry Constraint. Artifacts in the airspace can typically fit the RGB images well despite having inaccurate depth. To address this, we impose constraints on depth to suppress floaters generated in the airspace. For each training view, we generate a pseudo-view by applying a slight perturbation to the camera pose. We then warp the rendered pseudo-view image $I_{\text{pse}}^{\text{r}}$ into the training view using the rendered depth map $D_{\text{pse}}^{\text{r}}$, obtaining $I_{\text{warp}}^{\text{r}}$. The loss computed between $I_{\text{warp}}^{\text{r}}$ and the training-view ground truth $I_{\text{ref}}^{\text{gt}}$ provides depth supervision.
  • Figure 5: Qualitative Results on Mill19 and UrbanScene3D Datasets.
  • ...and 5 more figures
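The warping step described in the Figure 4 caption can be sketched as follows. This is a simplified NumPy illustration under stated assumptions, not the authors' implementation: both views share pinhole intrinsics `K`, `T_ref_from_pse` is a hypothetical 4x4 rigid transform from pseudo-camera to training-camera coordinates, pixels are splatted to the nearest target pixel, and the L1 loss is an assumed choice. In actual training the warp would need to be differentiable so that the loss can supervise the rendered depth.

```python
import numpy as np


def warp_pseudo_to_ref(img_pse, depth_pse, K, T_ref_from_pse):
    """Forward-warp a rendered pseudo-view image into the training view.

    img_pse:   (H, W, 3) rendered pseudo-view image
    depth_pse: (H, W) rendered pseudo-view depth (z in camera space)
    Returns the warped image (H, W, 3) and a boolean mask of filled pixels.
    """
    H, W = depth_pse.shape
    v, u = np.mgrid[0:H, 0:W]
    pix = np.stack([u, v, np.ones_like(u)], axis=-1).reshape(-1, 3).T.astype(np.float64)

    # Back-project pseudo-view pixels to 3D, then move them to the training camera.
    pts_pse = (np.linalg.inv(K) @ pix) * depth_pse.reshape(1, -1)
    pts_ref = T_ref_from_pse[:3, :3] @ pts_pse + T_ref_from_pse[:3, 3:4]

    # Project into the training view, guarding against points behind the camera.
    z = pts_ref[2]
    valid = z > 1e-6
    safe_z = np.where(valid, z, 1.0)
    proj = K @ pts_ref
    u_ref = np.round(proj[0] / safe_z).astype(np.int64)
    v_ref = np.round(proj[1] / safe_z).astype(np.int64)
    valid &= (u_ref >= 0) & (u_ref < W) & (v_ref >= 0) & (v_ref < H)

    # For each target pixel keep only the nearest source point (a z-buffer).
    idx = np.flatnonzero(valid)
    tgt = v_ref[idx] * W + u_ref[idx]
    order = np.lexsort((z[idx], tgt))  # sort by target pixel, then by depth
    tgt_sorted = tgt[order]
    dst, first = np.unique(tgt_sorted, return_index=True)
    src = idx[order][first]

    warped = np.zeros((H * W, 3), dtype=img_pse.dtype)
    mask = np.zeros(H * W, dtype=bool)
    warped[dst] = img_pse.reshape(-1, 3)[src]
    mask[dst] = True
    return warped.reshape(H, W, 3), mask.reshape(H, W)


def masked_l1(warped, mask, img_ref_gt):
    """Mean absolute error over the pixels that received a warped value."""
    if not mask.any():
        return 0.0
    return float(np.abs(warped[mask] - img_ref_gt[mask]).mean())
```

If the rendered depth is wrong, as it tends to be for airspace floaters, the warped pixels land in the wrong place and the photometric error against the training image grows, which is why this loss acts as depth supervision.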