A LoD of Gaussians: Unified Training and Rendering for Ultra-Large Scale Reconstruction with External Memory

Felix Windisch, Thomas Köhler, Lukas Radl, Michael Steiner, Dieter Schmalstieg, Markus Steinberger

TL;DR

This work tackles memory bottlenecks in large-scale Gaussian Splatting for novel view synthesis by introducing an out-of-core training and rendering pipeline that operates on a single consumer GPU. It leverages a hierarchical Level-of-Detail (LoD) built from Sequential Point Trees (SPTs) and a novel Hierarchical SPT (HSPT) to enable fast, view-dependent Gaussian selection while streaming data from CPU RAM as needed. A densification strategy expands the LoD during training, and a lightweight GPU caching and view-scheduling system exploits temporal coherence to minimize data transfers. Experiments on MatrixCity datasets show improved rendering quality with fewer artifacts and significantly reduced VRAM usage compared with chunk-based baselines, validating the approach for city-scale scenes. The method enables interactive visualization and reconstruction of ultra-large scenes on commodity hardware, representing a practical step toward scalable radiance-field representations.

Abstract

Gaussian Splatting has emerged as a high-performance technique for novel view synthesis, enabling real-time rendering and high-quality reconstruction of small scenes. However, scaling to larger environments has so far relied on partitioning the scene into chunks -- a strategy that introduces artifacts at chunk boundaries, complicates training across varying scales, and is poorly suited to unstructured scenarios such as city-scale flyovers combined with street-level views. Moreover, rendering remains fundamentally limited by GPU memory, as all visible chunks must reside in VRAM simultaneously. We introduce A LoD of Gaussians, a framework for training and rendering ultra-large-scale Gaussian scenes on a single consumer-grade GPU -- without partitioning. Our method stores the full scene out-of-core (e.g., in CPU memory) and trains a Level-of-Detail (LoD) representation directly, dynamically streaming only the relevant Gaussians. A hybrid data structure combining Gaussian hierarchies with Sequential Point Trees enables efficient, view-dependent LoD selection, while a lightweight caching and view scheduling system exploits temporal coherence to support real-time streaming and rendering. Together, these innovations enable seamless multi-scale reconstruction and interactive visualization of complex scenes -- from broad aerial views to fine-grained ground-level details.
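
As a rough illustration of the view-dependent selection over a Sequential Point Tree, the sketch below follows the general SPT idea: every node carries a distance interval in which it is the right level of detail, nodes are stored sorted by the upper bound of that interval, a binary search finds the candidate prefix, and a per-node test on the lower bound yields the cut. The names `spt_cut`, `MIN_DIST`, and `MAX_DIST`, the five-node toy hierarchy, and the use of a single viewing distance per tree are assumptions made for this example; the paper's actual criteria and data layout may differ.

```python
from bisect import bisect_left
from math import inf

# Toy hierarchy with five nodes: root R, inner node A, leaves B, A1, A2.
# Each node is selected when MIN_DIST[i] <= d < MAX_DIST[i].
# Nodes are stored sorted by MAX_DIST in descending order (hypothetical values).
NAMES = ["R", "A", "B", "A1", "A2"]
MAX_DIST = [inf, 100.0, 100.0, 40.0, 40.0]
MIN_DIST = [100.0, 40.0, 0.0, 0.0, 0.0]


def spt_cut(min_dist, max_dist, d):
    """Return indices of the nodes forming the cut for viewing distance d.

    max_dist must be sorted in descending order.
    """
    # bisect expects ascending keys, so search over the negated upper bounds.
    negated = [-m for m in max_dist]
    # Binary search: the first `prefix` nodes are exactly those with max_dist > d.
    prefix = bisect_left(negated, -d)
    # Per-node test on the lower bound inside the prefix.
    return [i for i in range(prefix) if min_dist[i] <= d]


def cut_names(d):
    """Convenience wrapper returning node names instead of indices."""
    return [NAMES[i] for i in spt_cut(MIN_DIST, MAX_DIST, d)]
```

With these toy intervals, a distant view selects only the root, a mid-range view selects `A` and `B`, and a close view selects the three leaves, so each distance yields a complete, non-overlapping cut of the hierarchy.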

Paper Structure

This paper contains 56 sections, 7 equations, 23 figures, 5 tables, 3 algorithms.

Figures (23)

  • Figure 1: An SPT and Gaussian hierarchy represent the same 5 Gaussians at varying levels of detail shown with four possible hierarchy cuts in red. Vertical lines in the SPT show the binary search result and horizontal lines the distance cut.
  • Figure 2: Example of densifying and respawning leaf nodes.
  • Figure 3: A Gaussian hierarchy is converted to an HSPT by cutting according to Gaussian volume and converting sufficiently large subtrees to SPTs. The HSPT can then be cut in a 2-step process.
  • Figure 4: Gaussians required for the current training view are assembled from three sources: the upper tree, newly loaded SPT cuts from RAM, and cache hits. After optimization, newly accessed SPTs are added to the GPU cache.
  • Figure 5: Peak memory consumption of CPU and GPU for a training iteration on MC-smaller-city+ with 60 million Gaussians.
  • ...and 18 more figures
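
The assembly step described in the Figure 4 caption (upper tree, newly loaded SPT cuts from RAM, and cache hits, with newly accessed SPTs cached after optimization) can be sketched as below. This is a minimal toy model, not the authors' implementation: the names `SptCache`, `assemble_view`, and `load_from_ram` are hypothetical, the least-recently-used eviction policy is an assumption the source does not state, and the sketch ignores that a cached cut is only valid for a limited range of viewing distances.

```python
from collections import OrderedDict


class SptCache:
    """Toy stand-in for a GPU-side cache of SPT cuts, keyed by SPT id."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()

    def get(self, spt_id):
        """Return the cached cut, or None on a miss."""
        if spt_id not in self.entries:
            return None
        self.entries.move_to_end(spt_id)  # mark as most recently used
        return self.entries[spt_id]

    def put(self, spt_id, gaussians):
        """Insert a cut and evict least recently used entries if over capacity."""
        self.entries[spt_id] = gaussians
        self.entries.move_to_end(spt_id)
        while len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # assumed LRU eviction


def assemble_view(upper_tree_cut, needed_spts, cache, load_from_ram):
    """Collect the Gaussians for one view from the three sources."""
    gaussians = list(upper_tree_cut)      # source 1: upper tree
    newly_loaded = {}
    for spt_id in needed_spts:
        cut = cache.get(spt_id)           # source 3: cache hit
        if cut is None:
            cut = load_from_ram(spt_id)   # source 2: streamed from RAM
            newly_loaded[spt_id] = cut
        gaussians.extend(cut)
    return gaussians, newly_loaded


def commit_to_cache(cache, newly_loaded):
    """After the optimization step, add the newly accessed SPTs to the cache."""
    for spt_id, cut in newly_loaded.items():
        cache.put(spt_id, cut)
```

Deferring the cache insert until after the optimization step mirrors the order given in the caption; consecutive views that overlap then reuse most of their SPT cuts without another transfer from RAM.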