Generalized non-exponential Gaussian splatting

Sébastien Speierer, Adrian Jarabo

TL;DR

This work generalizes 3D Gaussian splatting to a wider family of physically-based alpha-blending operators, and uses a quadratic transmittance to define sub-linear, linear, and super-linear versions of 3DGS, which exhibit faster-than-exponential decay.

Abstract

In this work we generalize 3D Gaussian splatting (3DGS) to a wider family of physically-based alpha-blending operators. 3DGS has become the de facto standard for radiance field rendering and reconstruction, given its flexibility and efficiency. At its core, it is based on alpha-blending sorted semitransparent primitives, which in the limit converges to the classic radiative transfer function with exponential transmittance. Inspired by recent research on non-exponential radiative transfer, we generalize the image formation model of 3DGS to non-exponential regimes. Based on this generalization, we use a quadratic transmittance to define sub-linear, linear, and super-linear versions of 3DGS, which exhibit faster-than-exponential decay. We demonstrate that these new non-exponential variants achieve quality similar to the original 3DGS while significantly reducing the number of overdraws, which results in speed-ups of up to $4\times$ in complex real-world captures on a ray-tracing-based renderer.
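
The claim that alpha-blending sorted primitives converges to exponential transmittance can be checked numerically. The sketch below is an illustration only, not the authors' implementation: it assumes a total optical depth `tau` spread evenly over `n` thin splats (the helper `split_into_splats` and both names are hypothetical), and compares the blended transmittance $\prod_i (1 - \alpha_i)$ with $e^{-\tau}$.

```python
import math

def alpha_blend_transmittance(alphas):
    """Transmittance left after front-to-back alpha blending of sorted splats."""
    transmittance = 1.0
    for alpha in alphas:
        # Each splat independently blocks a fraction alpha of the remaining light.
        transmittance *= 1.0 - alpha
    return transmittance

def split_into_splats(tau, n):
    """Hypothetical helper: spread optical depth tau evenly over n thin splats."""
    return [tau / n] * n

tau = 1.0  # assumed total optical depth along the ray
for n in (10, 100, 1000, 10000):
    discrete = alpha_blend_transmittance(split_into_splats(tau, n))
    # The gap to exp(-tau) shrinks as the splats get thinner and more numerous.
    print(n, discrete, abs(discrete - math.exp(-tau)))
```

As `n` grows, the printed gap shrinks, which is the discrete-to-continuous limit the abstract refers to.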

Paper Structure (38 sections, 31 equations, 5 figures, 4 tables)

This paper contains 38 sections, 31 equations, 5 figures, 4 tables.

Figures (5)

  • Figure 1: Physical interpretation of splats. Splats can be thought of as disks consisting of small random occluding particles. When light passes through a slab, some light is lost when hitting these particles. For uncorrelated splats (a), light occlusion in each splat is an independent stochastic process. For negatively-correlated splats (b), on the other hand, all light that passes through the first slab is occluded by the second one.
  • Figure 2: Transmittance comparison. We render 100 randomly rotated elliptical Gaussians with opacity $\aleph = 0.02$, distributed uniformly in depth, for nine different transmittance functions, and compute the inverse transmittance $1 - \bar{T}_{N+1}$ per pixel (top) and the number of overdraws until transmittance saturates (middle, colormap goes from black to yellow). The bottom plots show the theoretical mother transmittance $T(\tau_t)$ of each model (green) compared with the baseline exponential (dashed red), and the discrete transmittance $\bar{T}_i$ at the rendered splats (blue dots). Our predicted discrete transmittance matches the theoretical one, which validates that our generalized model for splatting converges to the continuous generalized RTE. The faster the decay, the fewer overdraws are required to saturate, and the more opaque the reconstructed appearance.
  • Figure 3: Blending comparison. We render 3 Gaussians with different colors and depths (red is closest, then green, with blue at the back) with opacity $\aleph=1$, either (a) isotropic and growing in size or (b) anisotropic with different rotations in the image plane. All splats are parallel to the image plane and do not intersect. In both (a) and (b), the top row shows the result of blending, the middle row shows the scan-line at the center of the top image (the black dashed line is the sum of the three Gaussians), and the bottom row shows the inverse discrete transmittance $1 - \bar{T}_{N+1}$. Faster-than-exponential transmittances (third to ninth columns) show sharper blending, which in extreme cases (power-law with $v=-1$) results in unintuitive blending. This sharper blending behavior is in part due to transparency saturating to one (see middle row), which creates $C_1$ discontinuities on the blended primitives. Only the exponential transmittance does not show this behavior, because the splats are statistically uncorrelated, at the cost of more overdraws.
  • Figure 4: Reconstruction results on NeRF synthetic scenes. Each row shows one scene (Chair, Hotdog, Lego and Materials), with the ground-truth reference on the left and the reconstructions using each of the four image formation models to the right. The rightmost column plots PSNR convergence over wall-clock time (top) and iteration count (bottom). Under a fixed time budget, the non-exponential models run significantly more iterations and converge to a higher PSNR, while per-iteration convergence remains comparable across all models.
  • Figure 5: Refinement results on real scenes. Each row shows one scene (Dr Johnson, Playroom, Train and Truck), with the ground-truth reference on the left and the reconstructions using each of the four image formation models to the right. Insets show the per-pixel overdraw count. The non-exponential models maintain visual quality comparable to the exponential baseline while substantially reducing overdraw, which results in $3$--$4\times$ rendering speedups.
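
The link between faster transmittance decay and fewer overdraws, described in the captions of Figures 1 and 2, can be illustrated with a toy overdraw counter. This is a minimal sketch under stated assumptions, not the paper's model: the clamped linear transmittance below is a simple stand-in for negatively-correlated splats (it is not the quadratic transmittance family the paper defines), and the saturation threshold `1e-3` and all function names are hypothetical.

```python
def exponential_step(transmittance, alpha, accumulated):
    """Uncorrelated splats: each one attenuates the remaining light by (1 - alpha)."""
    return transmittance * (1.0 - alpha)

def linear_step(transmittance, alpha, accumulated):
    """Assumed stand-in for negatively-correlated splats: opacities add up linearly."""
    return max(0.0, 1.0 - accumulated)

def overdraws(alphas, step, threshold=1e-3):
    """Count splats blended front to back before transmittance falls to threshold."""
    transmittance, accumulated = 1.0, 0.0
    for count, alpha in enumerate(alphas, start=1):
        accumulated += alpha  # running sum of opacities, including this splat
        transmittance = step(transmittance, alpha, accumulated)
        if transmittance <= threshold:
            return count  # early termination: later splats are never drawn
    return len(alphas)  # never saturated, so every splat was drawn

# Same setup as the Figure 2 caption: 100 splats with opacity 0.02 each.
splats = [0.02] * 100
print("exponential:", overdraws(splats, exponential_step))
print("linear:     ", overdraws(splats, linear_step))
```

With this setup the exponential model never reaches the threshold within the 100 splats, whereas the linear stand-in terminates partway through the stack, so fewer splats are blended per ray.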

Theorems & Definitions (1)

  • proof