DiskChunGS: Large-Scale 3D Gaussian SLAM Through Chunk-Based Memory Management

Casimir Feldmann, Maximum Wilder-Smith, Vaishakh Patil, Michael Oechsle, Michael Niemeyer, Keisuke Tateno, Marco Hutter

TL;DR

DiskChunGS tackles the memory bottleneck of large-scale 3D Gaussian SLAM by introducing an out-of-core, chunk-based memory architecture that streams active scene regions from disk into VRAM. The approach integrates with ORB-SLAM3 for pose estimation and loop closure, and uses frustum-based chunk loading, LRU-based eviction, and depth-supervised differentiable rendering to maintain high visual fidelity at scale. Experiments show that the method completes all 11 KITTI sequences without memory failures while delivering superior perceptual quality and efficiency, even on edge hardware like the Jetson Orin. The work presents a practical, production-ready solution that shifts the scalability challenge from hardware constraints to algorithmic design, enabling multi-kilometer photorealistic reconstruction in real-world robotics.
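The LRU-based eviction mentioned above can be illustrated with a minimal sketch. This is not the authors' implementation: the `ChunkCache` class, its `touch` method, the chunk-count budget, and the in-memory `disk` dictionary standing in for real storage are all hypothetical names and simplifications chosen for illustration.

```python
from collections import OrderedDict


class ChunkCache:
    """Toy LRU cache: `resident` stands in for VRAM, `disk` for storage."""

    def __init__(self, capacity):
        # Assumed budget: maximum number of resident chunks (must be >= 1).
        self.capacity = capacity
        self.resident = OrderedDict()  # chunk_id -> data, least recent first
        self.disk = {}                 # chunk_id -> data, simulated disk

    def touch(self, chunk_id):
        """Make a chunk resident and mark it as most recently used."""
        if chunk_id in self.resident:
            self.resident.move_to_end(chunk_id)
        else:
            # Load from the simulated disk, or start an empty chunk.
            self.resident[chunk_id] = self.disk.pop(chunk_id, [])
        # Evict least recently used chunks until the budget is respected.
        while len(self.resident) > self.capacity:
            old_id, old_data = self.resident.popitem(last=False)
            self.disk[old_id] = old_data
        return self.resident[chunk_id]


cache = ChunkCache(capacity=2)
for cid in ["a", "b", "a", "c"]:
    cache.touch(cid)
print(list(cache.resident))  # ['a', 'c']
print(list(cache.disk))      # ['b']
```

Chunk "b" is evicted rather than "a" because "a" was touched again after "b". A real system would budget by bytes of VRAM rather than by chunk count and would serialize Gaussian parameters to files; both are left out here.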

Abstract

Recent advances in 3D Gaussian Splatting (3DGS) have demonstrated impressive results for novel view synthesis with real-time rendering capabilities. However, integrating 3DGS with SLAM systems faces a fundamental scalability limitation: methods are constrained by GPU memory capacity, restricting reconstruction to small-scale environments. We present DiskChunGS, a scalable 3DGS SLAM system that overcomes this bottleneck through an out-of-core approach that partitions scenes into spatial chunks and maintains only active regions in GPU memory while storing inactive areas on disk. Our architecture integrates seamlessly with existing SLAM frameworks for pose estimation and loop closure, enabling globally consistent reconstruction at scale. We validate DiskChunGS on indoor scenes (Replica, TUM-RGBD), urban driving scenarios (KITTI), and resource-constrained Nvidia Jetson platforms. Our method uniquely completes all 11 KITTI sequences without memory failures while achieving superior visual quality, demonstrating that algorithmic innovation can overcome the memory constraints that have limited previous 3DGS SLAM methods.
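As an illustration of partitioning a scene into spatial chunks, the sketch below assigns each Gaussian center to a cell of a uniform grid. The abstract does not specify the chunk shape or size, so the axis-aligned cubic grid, the `chunk_size` value, and the function names `chunk_key` and `partition` are assumptions made for this example.

```python
import math
from collections import defaultdict


def chunk_key(position, chunk_size):
    """Map a 3D position to the integer index of its grid cell."""
    # floor (not int truncation) keeps negative coordinates in the right cell.
    return tuple(math.floor(c / chunk_size) for c in position)


def partition(positions, chunk_size):
    """Group point indices by the chunk that contains them."""
    chunks = defaultdict(list)
    for i, p in enumerate(positions):
        chunks[chunk_key(p, chunk_size)].append(i)
    return dict(chunks)


# Hypothetical Gaussian centers in meters.
positions = [(0.5, 1.0, 2.0), (9.9, 0.0, 0.0), (10.0, 0.0, 0.0), (-0.1, 0.0, 0.0)]
chunks = partition(positions, chunk_size=10.0)
print(chunks)
# {(0, 0, 0): [0, 1], (1, 0, 0): [2], (-1, 0, 0): [3]}
```

With integer keys like these, each chunk can be saved to and loaded from disk independently, which is what makes it possible to keep only the active regions in GPU memory.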

Paper Structure

This paper contains 6 sections, 8 equations, 5 figures, 6 tables.

Figures (5)

  • Figure 1: Pareto curves for KITTI (Geiger et al., 2012) scenes. With more iterations, and consequently more processing time, 3DGS SLAM methods can optimize for longer and achieve higher reconstruction quality. Our method achieves superior visual quality in less time than competing methods across all three scenes.
  • Figure 2: Overview of DiskChunGS. For each SLAM keyframe, we estimate depth and perform direct primitive placement based on image content analysis instead of iterative densification. To optimize a keyframe, frustum culling identifies the visible chunks, which are loaded from disk into VRAM. Meanwhile, old chunks are evicted from VRAM to disk to free up memory. The visible subset of Gaussians in VRAM is then rasterized, and image/depth losses are calculated.
  • Figure 3: Qualitative results on the KITTI (Geiger et al., 2012) dataset, showing reconstructions of three scenes by all methods. On-The-Fly suffers from tracking drift and lacks loop closure. GigaSLAM's neural approach fails without expensive post-processing. CaRtGS shows floating artifacts from missing depth supervision. Our method achieves superior quality through robust tracking, depth-supervised Gaussian placement, and efficient chunk-based optimization.
  • Figure 4: Zoomed qualitative results on the Replica (Straub et al., 2019) dataset (office0).
  • Figure 5: Active Gaussians and keyframes vs. VRAM usage. Our disk-based saving and loading system for Gaussians and keyframes keeps memory usage steady as scene size increases.
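The frustum culling step described in the Figure 2 caption can be sketched as a test of each chunk's bounding sphere against the six frustum planes. This is an illustrative simplification, not the paper's method: it assumes cubic chunks indexed by integer keys, a camera that looks along the world +z axis without rotation, and a symmetric field of view. The function name `visible_chunks` and all parameters are hypothetical.

```python
import math


def visible_chunks(chunk_keys, chunk_size, cam_pos, half_fov_h, half_fov_v, near, far):
    """Return keys of chunks whose bounding sphere touches the view frustum.

    Assumes an unrotated camera at `cam_pos` looking along +z; angles in radians.
    """
    radius = chunk_size * math.sqrt(3) / 2  # sphere enclosing a cubic chunk
    sh, ch = math.sin(half_fov_h), math.cos(half_fov_h)
    sv, cv = math.sin(half_fov_v), math.cos(half_fov_v)
    visible = []
    for key in chunk_keys:
        # Chunk center relative to the camera.
        x, y, z = ((k + 0.5) * chunk_size - c for k, c in zip(key, cam_pos))
        # Signed distances to the six planes (positive means inside).
        distances = (
            z - near,         # near plane
            far - z,          # far plane
            z * sh - x * ch,  # right plane
            z * sh + x * ch,  # left plane
            z * sv - y * cv,  # top plane
            z * sv + y * cv,  # bottom plane
        )
        if all(d >= -radius for d in distances):
            visible.append(key)
    return visible


keys = [(0, 0, 1), (0, 0, -3), (5, 0, 1), (0, 0, 8)]
result = visible_chunks(
    keys, chunk_size=10.0, cam_pos=(5.0, 5.0, 0.0),
    half_fov_h=math.radians(45), half_fov_v=math.radians(45),
    near=0.1, far=50.0,
)
print(result)  # [(0, 0, 1)]
```

The sphere test is conservative: it may keep a chunk that lies just outside a frustum corner, but it never drops a chunk that intersects the frustum. The returned keys correspond to the chunks that would be loaded into VRAM for the keyframe, while chunks not seen recently become candidates for eviction to disk.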