VR-Splatting: Foveated Radiance Field Rendering via 3D Gaussian Splatting and Neural Points

Linus Franke, Laura Fink, Marc Stamminger

TL;DR

This work tackles real-time radiance-field rendering for VR under strict performance budgets by introducing VR-Splatting, a hybrid foveated approach that uses lightweight Gaussian splats in the periphery and neural-point rendering in the fovea. The method is trained end-to-end to produce a coherent, high-fidelity image while maintaining VR framerate targets, leveraging eye-tracking to drive a foveal crop and image-based CNN refinement. Key contributions include a novel combination strategy, a per-pixel occlusion-aware foveal rendering pipeline, and a loss design that jointly optimizes peripheral quality and foveal detail without artifacts like popping. Experiments on real-world datasets show real-time performance at full VR resolution with superior perceptual quality (LPIPS) and a strong user preference in a 2AFC study, highlighting practical impact for immersive VR experiences.

Abstract

Recent advances in novel view synthesis have demonstrated impressive results in fast photorealistic scene rendering through differentiable point rendering, either via Gaussian Splatting (3DGS) [Kerbl and Kopanas et al. 2023] or neural point rendering [Aliev et al. 2020]. Unfortunately, these directions require either a large number of small Gaussians or expensive per-pixel post-processing to reconstruct fine details, which negatively impacts rendering performance. To meet the high performance demands of virtual reality (VR) systems, primitive or pixel counts must therefore be kept low, which degrades visual quality. In this paper, we propose a novel hybrid approach based on foveated rendering that exploits the performance sweet spots of both point rendering directions. Analyzing the compatibility with the human visual system, we find that a low-detail, smooth Gaussian representation with few primitives is cheap to compute for the periphery and meets the perceptual demands of peripheral vision. For the fovea only, we use neural points with a convolutional neural network on the small pixel footprint, which provides sharp, detailed output within the rendering budget. The combination also enables synergistic accelerations: point occlusion culling and reduced demands on the neural network. Our evaluation confirms that our approach increases sharpness and detail compared to a standard VR-ready 3DGS configuration, and participants in a user study overwhelmingly preferred our method. Our system meets the necessary performance requirements for real-time VR interactions, ultimately enhancing the user's immersive experience. The project page can be found at: https://lfranke.github.io/vr_splatting
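
The point occlusion culling mentioned in the abstract can be illustrated with a minimal sketch. It assumes the neural points have already been projected to pixel coordinates of the foveal crop and that the Gaussian pass supplies an approximate depth map for that crop. The function name `cull_occluded_points`, the tuple layout, and the relative tolerance `rel_tol` are hypothetical choices for illustration, not the authors' implementation.

```python
import math

def cull_occluded_points(points, depth_map, rel_tol=0.05):
    """Drop points that lie behind an approximate depth map.

    points:    list of (px, py, depth) tuples in crop pixel coordinates
               (hypothetical layout for this sketch)
    depth_map: 2D list indexed [row][col] with approximate depths,
               assumed to come from the cheaper Gaussian pass
    rel_tol:   assumed relative slack that absorbs depth-map inaccuracy
    """
    height = len(depth_map)
    width = len(depth_map[0]) if height else 0
    visible = []
    for px, py, depth in points:
        col, row = math.floor(px), math.floor(py)
        if not (0 <= col < width and 0 <= row < height):
            continue  # the point falls outside the foveal crop
        # Keep the point if it is in front of (or close to) the stored surface.
        if depth <= depth_map[row][col] * (1.0 + rel_tol):
            visible.append((px, py, depth))
    return visible
```

Only the surviving points would then need to be rasterized and passed to the network, which is one plausible reading of how culling reduces the per-frame workload in the fovea.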

Paper Structure

This paper contains 36 sections, 7 equations, 8 figures, 3 tables.

Figures (8)

  • Figure 1: Rendering efficiency of 3D Gaussian Splatting [Kerbl and Kopanas et al. 2023] and TRIPS [Franke et al. 2024]. Left: Gaussian Splatting (GS) performance decreases drastically as the number of primitives increases. To fit within VR limits, Gaussian counts need to be kept low, impacting quality. Right: In contrast, TRIPS performance scales mainly with resolution, allowing efficient rendering of small, high-quality crops.
  • Figure 2: Equal render time comparison for components of our method. Fovea-sized crops highlighted. For fine details, TRIPS often shows crisper results.
  • Figure 3: Gaussians in 3DGS "popping" in and out during small camera rotations, resulting in black spots suddenly appearing and disappearing. Popping is especially noticeable in VR, as revealed in our pilot study.
  • Figure 4: Our Pipeline. A smooth color image and an approximate depth map are rendered from a limited set of 3D Gaussians with temporally stable sorting enabled (see the section on the Gaussian periphery). Afterwards, the eye tracking system is queried to construct a subfrustum via an adapted projection matrix covering only the foveal region. We project a separate neural point cloud with the adapted matrix. Points occluded by the denser Gaussian splats are culled against the approximate depth map, and the result is processed by a small CNN (see the section on foveal neural points). Finally, we blend the peripheral color output and the foveal region using an edge-aware mask (see the section on combination).
  • Figure 5: Our combination mask (b) in effect (zoomed in). It is used to adaptively merge Gaussian output with neural point rendering.
  • ...and 3 more figures
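
The subfrustum step described in the Figure 4 caption can be sketched as follows. This is a minimal illustration under stated assumptions: an OpenGL-style off-axis projection (the `glFrustum` convention), a gaze position given in normalized device coordinates of the full view, and a fixed foveal half-extent. The helper names `sub_frustum`, `frustum_matrix`, and `project` are hypothetical, and the paper's actual matrix conventions may differ.

```python
def sub_frustum(left, right, bottom, top, gaze_ndc, half_extent_ndc):
    """Return near-plane bounds of a crop centered on the gaze point.

    gaze_ndc and half_extent_ndc are (x, y) pairs in the full view's
    normalized device coordinates, where each axis spans [-1, 1].
    """
    gx, gy = gaze_ndc
    hx, hy = half_extent_ndc

    def from_ndc(lo, hi, t):
        # Map an NDC coordinate in [-1, 1] to the interval [lo, hi].
        return lo + (hi - lo) * (t + 1.0) * 0.5

    return (from_ndc(left, right, gx - hx), from_ndc(left, right, gx + hx),
            from_ndc(bottom, top, gy - hy), from_ndc(bottom, top, gy + hy))


def frustum_matrix(l, r, b, t, n, f):
    """Off-axis perspective matrix in the OpenGL glFrustum convention."""
    return [
        [2.0 * n / (r - l), 0.0, (r + l) / (r - l), 0.0],
        [0.0, 2.0 * n / (t - b), (t + b) / (t - b), 0.0],
        [0.0, 0.0, -(f + n) / (f - n), -2.0 * f * n / (f - n)],
        [0.0, 0.0, -1.0, 0.0],
    ]


def project(matrix, point):
    """Project a camera-space point (x, y, z) to NDC of the given matrix."""
    v = (point[0], point[1], point[2], 1.0)
    clip = [sum(matrix[i][j] * v[j] for j in range(4)) for i in range(4)]
    return tuple(c / clip[3] for c in clip[:3])


# Usage: a full frustum of [-1, 1] x [-1, 1] at near = 1, gaze to the right.
bounds = sub_frustum(-1.0, 1.0, -1.0, 1.0, gaze_ndc=(0.5, 0.0),
                     half_extent_ndc=(0.25, 0.25))
fovea_proj = frustum_matrix(*bounds, n=1.0, f=100.0)
```

With this construction, a camera-space point that lies on the gaze ray projects to the center of the foveal crop, so the neural point cloud can be rendered at crop resolution without rendering the full view.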