StopThePop: Sorted Gaussian Splatting for View-Consistent Real-time Rendering

Lukas Radl, Michael Steiner, Mathias Parger, Alexander Weinrauch, Bernhard Kerbl, Markus Steinberger

TL;DR

This paper presents a novel hierarchical rasterization approach that systematically resorts and culls splats with minimal processing overhead and mitigates the potential for cheating view-dependent effects with popping, ensuring a more authentic representation.

Abstract

Gaussian Splatting has emerged as a prominent model for constructing 3D representations from images across diverse domains. However, the efficiency of the 3D Gaussian Splatting rendering pipeline relies on several simplifications. Notably, reducing 3D Gaussians to 2D splats with a single view-space depth introduces popping and blending artifacts during view rotation. Addressing this issue requires accurate per-pixel depth computation, yet a full per-pixel sort proves excessively costly compared to a global sort operation. In this paper, we present a novel hierarchical rasterization approach that systematically resorts and culls splats with minimal processing overhead. Our software rasterizer effectively eliminates popping artifacts and view inconsistencies, as demonstrated through both quantitative and qualitative measurements. Simultaneously, our method mitigates the potential for cheating view-dependent effects with popping, ensuring a more authentic representation. Despite the elimination of cheating, our approach achieves comparable quantitative results for test images, while increasing the consistency for novel view synthesis in motion. Due to its design, our hierarchical approach is only 4% slower on average than the original Gaussian Splatting. Notably, enforcing consistency enables a reduction in the number of Gaussians by approximately half with nearly identical quality and view-consistency. Consequently, rendering performance is nearly doubled, making our approach 1.6x faster than the original Gaussian Splatting, with a 50% reduction in memory requirements.
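The popping mechanism named in the abstract can be illustrated with a toy example. The sketch below is not the paper's code: it assumes a camera fixed at the origin that only yaws about the y-axis, and two hypothetical Gaussian centers `A` and `B`. Each center's view-space depth is its projection onto the view vector, so a pure rotation can swap the global sort order even though the camera has not moved.

```python
import math

def view_dir(yaw_deg):
    """Unit view vector for a camera yawed about the y-axis (0 degrees looks down +z)."""
    yaw = math.radians(yaw_deg)
    return (math.sin(yaw), 0.0, math.cos(yaw))

def view_space_depth(mean, cam_pos, v):
    """Depth as the projection of (mean - cam_pos) onto the view vector v."""
    return sum((m - c) * vi for m, c, vi in zip(mean, cam_pos, v))

def depth_order(means, cam_pos, yaw_deg):
    """Names of the Gaussians sorted front to back by view-space depth."""
    v = view_dir(yaw_deg)
    return sorted(means, key=lambda name: view_space_depth(means[name], cam_pos, v))

# Hypothetical scene: camera at the origin, two Gaussian centers at similar depth.
cam_pos = (0.0, 0.0, 0.0)
means = {"A": (1.0, 0.0, 2.0), "B": (-1.0, 0.0, 2.1)}

order_straight = depth_order(means, cam_pos, 0.0)   # looking down +z
order_rotated = depth_order(means, cam_pos, 20.0)   # same position, yawed by 20 degrees
print(order_straight, order_rotated)
```

In this setup the front-to-back order changes from `A, B` to `B, A` under rotation alone, which is the kind of sort-order flip that shows up as popping when the blended result depends on that order.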

Paper Structure

This paper contains 47 sections, 16 equations, 27 figures, 13 tables, 1 algorithm.

Figures (27)

  • Figure 1: Effect of collapsing 3D Gaussians into 2D splats and 3DGS's depth simplification: (a) Integrating Gaussians along view rays $\mathbf{r}$ requires careful consideration of potentially overlapping 1D Gaussians. (b) Using flattened 2D splats and view-space $z$ as depth (projection of $\boldsymbol{\mu}$ onto $\mathbf{v}$) puts 2D splats on spherical segments around the camera, inverting the relative positions of the two Gaussians along the example view ray. (c) Camera rotation inverts the order along $\mathbf{r}$, resulting in popping. (d) Camera translation does not alter the distance compared to (b).
  • Figure 2: Our approach to compute $t_{opt}$ avoids popping by placing splats at the point of maximum contribution along the view ray $\mathbf{r}$, creating sort orders independent of camera rotation (red view vector). Note that the shape of $t_{opt}$ is a curved surface and changes with the camera position; cf. Fig. \ref{fig:depth}.
  • Figure 4: Comparison of 3DGS with and without per-tile depth calculation. Per-tile depth calculation lowers sorting errors ($\delta_{max}=4.01, \delta_{avg}=0.284$ compared to $\delta_{max}=5.43, \delta_{avg}=0.898$). However, doing this without additional per-pixel sorting leads to artifacts at the tile borders.
  • Figure 5: Number of Gaussians per tile with and without tile-based culling for the Mip-NeRF 360 Garden scene. The average number of Gaussians per tile is reduced by $\sim44\%$.
  • Figure 6: Overview of the detailed steps in our pipeline. We add load balancing, tile culling and per-tile depth evaluation to the first two stages of 3DGS. Our hierarchical rasterizer utilizes three sorted queues, going from $4{\times}4$ tiles over $2{\times}2$ tiles to individual rays. The queues store only id and the tile's $t_{opt}$ per Gaussian, while additional information is re-fetched from global memory on demand, and shared between threads via shuffle operations. Depending on the queue fill levels, we switch between different cooperative group sizes while ensuring the queues remain filled for effective sorting. Our pipeline achieves an overall sorting window of 25-72 elements.
  • ...and 22 more figures
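
The caption of Figure 2 describes $t_{opt}$ as the point of maximum contribution along the view ray. As a minimal sketch of that idea, assume a ray $\mathbf{r}(t) = \mathbf{o} + t\,\mathbf{d}$ and a Gaussian with mean $\boldsymbol{\mu}$ and inverse covariance $\Sigma^{-1}$. Setting the derivative of the Mahalanobis distance along the ray to zero gives $t_{opt} = \frac{\mathbf{d}^T \Sigma^{-1} (\boldsymbol{\mu} - \mathbf{o})}{\mathbf{d}^T \Sigma^{-1} \mathbf{d}}$. The function and variable names below are illustrative assumptions, not taken from the paper's implementation, and the inverse covariance is passed in directly to keep the example short.

```python
def dot(a, b):
    """Dot product of two 3-vectors."""
    return sum(x * y for x, y in zip(a, b))

def mat_vec(m, x):
    """Product of a 3x3 matrix (nested lists) with a 3-vector."""
    return [dot(row, x) for row in m]

def t_opt(origin, direction, mean, precision):
    """Ray parameter t that maximizes the Gaussian density along origin + t * direction.

    `precision` is the symmetric inverse covariance matrix of the Gaussian.
    """
    offset = [m - o for m, o in zip(mean, origin)]
    pd = mat_vec(precision, direction)  # Sigma^{-1} d; symmetry gives d^T Sigma^{-1} x = pd . x
    return dot(pd, offset) / dot(pd, direction)

def mahalanobis_sq(t, origin, direction, mean, precision):
    """Squared Mahalanobis distance from the ray point at parameter t to the mean."""
    diff = [o + t * d - m for o, d, m in zip(origin, direction, mean)]
    return dot(diff, mat_vec(precision, diff))

origin = (0.0, 0.0, 0.0)
direction = (0.0, 0.0, 1.0)
mean = (0.5, 0.0, 3.0)

# Isotropic Gaussian: the maximum lies at the closest point on the ray to the mean.
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
t_iso = t_opt(origin, direction, mean, identity)

# Anisotropic Gaussian (hypothetical, positive definite precision matrix).
skewed = [[2.0, 0.0, 1.0], [0.0, 1.0, 0.0], [1.0, 0.0, 2.0]]
t_aniso = t_opt(origin, direction, mean, skewed)
print(t_iso, t_aniso)
```

The function takes only the ray and the Gaussian as inputs, never the camera's view vector, which is consistent with the caption's point that the resulting sort order does not change when the camera rotates in place. The result does change with the ray origin, matching the note that $t_{opt}$ depends on the camera position.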