LoBE-GS: Load-Balanced and Efficient 3D Gaussian Splatting for Large-Scale Scene Reconstruction

Sheng-Hsiang Hung, Ting-Yu Yen, Wei-Fang Sun, Simon See, Shih-Hsuan Hung, Hung-Kuo Chu

TL;DR

LoBE-GS scales 3D Gaussian Splatting to city-scale scenes by addressing two weaknesses of existing divide-and-conquer pipelines: load imbalance across blocks and inefficient coarse-to-fine training. It introduces a depth-aware partitioning strategy and an optimization-based balancing step that uses the initial visible Gaussians as a proxy for computational load, coupled with fast camera selection, visibility cropping, and selective densification. Together, these components reduce preprocessing and training time while preserving reconstruction quality, achieving up to 2x end-to-end speedups on large urban datasets. The approach makes large-scale 3DGS deployments practical and suggests future extensions with higher levels of detail and new representations.

Abstract

3D Gaussian Splatting (3DGS) has established itself as an efficient representation for real-time, high-fidelity 3D scene reconstruction. However, scaling 3DGS to large and unbounded scenes such as city blocks remains difficult. Existing divide-and-conquer methods alleviate memory pressure by partitioning the scene into blocks, but introduce new bottlenecks: (i) partitions suffer from severe load imbalance, since uniform or heuristic splits do not reflect actual computational demands, and (ii) coarse-to-fine pipelines fail to exploit the coarse stage efficiently, often reloading the entire model and incurring high overhead. In this work, we introduce LoBE-GS, a novel Load-Balanced and Efficient 3D Gaussian Splatting framework that re-engineers the large-scale 3DGS pipeline. LoBE-GS introduces a depth-aware partitioning method that reduces preprocessing from hours to minutes, an optimization-based strategy that balances visible Gaussians -- a strong proxy for computational load -- across blocks, and two lightweight techniques, visibility cropping and selective densification, to further reduce training cost. Evaluations on large-scale urban and outdoor datasets show that LoBE-GS achieves up to $2\times$ faster end-to-end training than state-of-the-art baselines, while maintaining reconstruction quality and enabling scalability to scenes infeasible with vanilla 3DGS.
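
The load-imbalance problem described in the abstract can be illustrated with a toy one-dimensional sketch. This is not the optimization used in LoBE-GS: it assumes every point contributes one unit of load (a crude stand-in for the visible-Gaussian proxy), splits along a single axis only, and swaps in simple quantile cuts in place of the paper's optimization-based strategy. All function names and the synthetic point distribution are hypothetical.

```python
import random

def uniform_cuts(lo, hi, n_blocks):
    """Equally spaced cut positions between lo and hi."""
    step = (hi - lo) / n_blocks
    return [lo + step * i for i in range(1, n_blocks)]

def balanced_cuts(xs, n_blocks):
    """Cut positions at quantiles of xs, so each strip holds about the same count."""
    s = sorted(xs)
    return [s[len(s) * i // n_blocks] for i in range(1, n_blocks)]

def block_loads(xs, cuts):
    """Number of points per strip; a point's strip index is the number of cuts <= x."""
    loads = [0] * (len(cuts) + 1)
    for x in xs:
        loads[sum(1 for c in cuts if x >= c)] += 1
    return loads

def imbalance(loads):
    """Heaviest strip divided by the mean strip load (1.0 means perfectly balanced)."""
    return max(loads) / (sum(loads) / len(loads))

# Synthetic scene (an assumption for illustration): a dense cluster plus sparse background.
rng = random.Random(0)
xs = [rng.gauss(0.3, 0.05) for _ in range(6000)]
xs += [rng.uniform(0.0, 1.0) for _ in range(2000)]

n_blocks = 4
uniform_loads = block_loads(xs, uniform_cuts(min(xs), max(xs), n_blocks))
balanced_loads = block_loads(xs, balanced_cuts(xs, n_blocks))
print("uniform :", uniform_loads, round(imbalance(uniform_loads), 2))
print("balanced:", balanced_loads, round(imbalance(balanced_loads), 2))
```

With equal-width strips, the strip containing the dense cluster carries most of the load, so parallel training would wait on that one block. Moving the cuts so that each strip carries a similar share of the proxy removes that straggler, which is the intuition behind balancing on a load proxy instead of on area.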

Paper Structure

This paper contains 27 sections, 3 equations, 5 figures, 8 tables.

Figures (5)

  • Figure 1: Illustration of per-block training time under different partitioning strategies. (Left) The uniform area partitioning in CityGS. (Right) The load-balanced partitioning in LoBE-GS.
  • Figure 2: Correlation between per-block training time and block-level statistics under CityGS's partitioning. (a) Plots of block area ${A}^{(b)}$ and camera count ${C}^{(b)}$. (b) Plots of Gaussian-based measures (${G}_{\mathrm{blk}}^{(b)}$, ${G}_{\mathrm{vis}}^{(b)}$, ${G}_{\mathrm{avg\_vis}}^{(b)}$). ${G}_{\mathrm{vis}}^{(b)}$ yields the strongest and most consistent correlation across datasets.
  • Figure 3: Overview of our framework. Our approach begins with training a coarse 3DGS model. Using our load balance–aware data partition, we optimize the grid cuts to achieve a more balanced division of the scene. We then apply visibility cropping and selective densification before and during the parallel fine-training stage, enabling faster and more efficient training. Finally, we prune regions outside each block and merge the results into a unified, high-quality model.
  • Figure A.1: Comparison of load balance and partitioning between CityGS (Left) and LoBE-GS (Right) across five datasets: Building, Rubble, Residence, Sci-Art, and MatrixCity-Aerial.
  • Figure A.2: Correlation between per-block training time and block-level statistics under CityGS's partitioning with both visibility cropping and selective densification enabled. (a) Plots of camera count ${C}^{(b)}$. (b) Plots of Gaussian-based measures (${G}_{\mathrm{blk}}^{(b)}$, ${G}_{\mathrm{vis}}^{(b)}$, ${G}_{\mathrm{avg\_vis}}^{(b)}$). ${G}_{\mathrm{vis}}^{(b)}$ yields the strongest and most consistent correlation across datasets even when selective densification is enabled.
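
The correlation analysis in Figures 2 and A.2 can be reproduced in spirit with a few lines of code. The sketch below assumes that a Pearson correlation coefficient is an acceptable measure (the captions do not say which coefficient is used) and ranks candidate block-level statistics by how well they track per-block training time. Every number is a synthetic placeholder chosen for illustration, not a measurement from the paper, and the names `pearson`, `train_time`, and `candidates` are hypothetical.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Synthetic placeholder values for five blocks (NOT data from the paper).
train_time = [10.0, 14.0, 22.0, 31.0, 45.0]    # per-block training time, arbitrary units
candidates = {
    "camera_count": [120, 90, 150, 110, 140],  # stand-in for C^(b)
    "g_vis": [2.0, 3.0, 4.9, 7.1, 10.0],       # stand-in for G_vis^(b), arbitrary units
}

# Score each candidate statistic by its correlation with training time.
scores = {name: pearson(vals, train_time) for name, vals in candidates.items()}
best = max(scores, key=scores.get)
for name, r in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{name}: r = {r:.3f}")
print("best proxy in this toy example:", best)
```

A statistic that correlates strongly with training time and is available before fine training starts is what makes it usable as a partitioning objective; the toy data is constructed so that the visible-Gaussian stand-in plays that role.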