GS-Cache: A GS-Cache Inference Framework for Large-scale Gaussian Splatting Models

Miao Tao; Yuanzhen Zhou; Haoran Xu; Zeyu He; Zhenyu Yang; Yuchang Zhang; Zhongling Su; Linning Xu; Zhenxiang Ma; Rong Fu; Hengjie Li; Xingcheng Zhang; Jidong Zhai

TL;DR

GS-Cache targets real-time, high-fidelity rendering of large-scale Gaussian Splatting scenes on consumer hardware by combining a cache-centric, de-redundancy rendering pipeline with an elastic multi-GPU scheduler. It introduces dynamic cache depth and binocular stereo de-redundancy to reuse Gaussian parameters across frames, complemented by dedicated CUDA kernels that accelerate the derivation and rasterization stages. Across city- and street-scale experiments, GS-Cache delivers up to 5.35x speedup, 35% lower latency, and 42% lower GPU memory usage, and achieves binocular 2K rendering at over 120 FPS with high visual quality on consumer-grade GPUs. The framework thereby enables scalable, real-time neural rendering for immersive VR environments, balancing performance, memory, and image fidelity through system-level optimizations.

Abstract

Rendering large-scale 3D Gaussian Splatting (3DGS) models in real time and at high fidelity on consumer-grade devices remains a significant challenge. Fully realizing the potential of 3DGS in applications such as virtual reality (VR) requires addressing critical system-level challenges to support real-time, immersive experiences. We propose GS-Cache, an end-to-end framework that integrates 3DGS's advanced representation with a highly optimized rendering system. GS-Cache introduces a cache-centric pipeline to eliminate redundant computations, an efficiency-aware scheduler for elastic multi-GPU rendering, and optimized CUDA kernels to overcome computational bottlenecks. This synergy between 3DGS and system design enables GS-Cache to achieve up to 5.35x performance improvement, 35% latency reduction, and 42% lower GPU memory usage, supporting 2K binocular rendering at over 120 FPS with high visual quality. By bridging the gap between 3DGS's representation power and the demands of VR systems, GS-Cache establishes a scalable and efficient framework for real-time neural rendering in immersive environments.
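
The idea of a cache-centric pipeline that reuses work across frames can be illustrated with a toy sketch. This is not the GS-Cache implementation: the class name `FrameCache`, the `derive` callback, and the rule "evict an entry once it has gone unused for more than `depth` frames" are all assumptions made here for illustration, and the "cache depth" in this sketch is fixed rather than dynamic.

```python
from typing import Callable, Dict, Hashable, Tuple


class FrameCache:
    """Toy frame-to-frame cache (hypothetical, not the paper's design).

    Derived values are kept while they are reused, and dropped once they
    have gone unused for more than `depth` frames.
    """

    def __init__(self, depth: int, derive: Callable[[Hashable], object]):
        if depth < 0:
            raise ValueError("depth must be non-negative")
        self.depth = depth
        self.derive = derive  # expensive step the cache is meant to skip
        self.entries: Dict[Hashable, Tuple[object, int]] = {}
        self.frame = 0
        self.hits = 0
        self.misses = 0

    def begin_frame(self) -> None:
        """Advance the frame counter and evict stale entries."""
        self.frame += 1
        stale = [
            key
            for key, (_, last_used) in self.entries.items()
            if self.frame - last_used > self.depth
        ]
        for key in stale:
            del self.entries[key]

    def get(self, key: Hashable) -> object:
        """Return the cached value for `key`, deriving it on a miss."""
        entry = self.entries.get(key)
        if entry is None:
            value = self.derive(key)
            self.misses += 1
        else:
            value = entry[0]
            self.hits += 1
        # Record that this key was used in the current frame.
        self.entries[key] = (value, self.frame)
        return value


def render_frames(cache: FrameCache, visible_per_frame) -> None:
    """Simulate rendering: each frame requests the keys visible in it."""
    for visible in visible_per_frame:
        cache.begin_frame()
        for key in visible:
            cache.get(key)
```

With consecutive frames whose visible sets overlap heavily, as is typical for smooth camera motion, most requests become hits and the `derive` step runs only for newly visible keys. A larger `depth` trades memory for a higher hit rate.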
