Gaussian Splatting with NeRF-based Color and Opacity
Dawid Malarz, Weronika Smolak, Jacek Tabor, Sławomir Tadeja, Przemysław Spurek
TL;DR
VDGS tackles the slow training and inference of NeRFs and the difficulty of conditioning Gaussian Splatting by proposing a hybrid that represents geometry with 3D Gaussians and uses a lightweight NeRF-based network to modulate view-dependent color and opacity. The method parameterizes Gaussians with trainable means $\mathrm{m}_i$, covariances $\Sigma_i$, and opacities $\sigma_i$, and learns a function $\mathcal{F}_{VDGS}(\mathrm{m}_i,\mathbf{d};\Theta)$ of the Gaussian mean and the viewing direction $\mathbf{d}$ that produces opacity updates $\Delta\sigma(\mathbf{d})$. These updates enter the pixel color through the differentiable compositing equation $C(p)=\sum_{i\in N} c_i \alpha_i \prod_{j=1}^{i-1}(1-\alpha_j)$ with $\alpha_i=1-\exp\left(-\sigma_i \cdot \mathcal{F}_{VDGS}(\mathrm{m}_i,\mathbf{d};\Theta) \cdot \delta_i\right)$. The results show that VDGS achieves competitive or superior view synthesis quality across standard datasets while maintaining GS-like training and inference speed, and that it improves the handling of shadows, reflections, and transparency. This work demonstrates that conditioning Gaussians with a NeRF-based network can combine the best of both paradigms for flexible, real-time-capable neural rendering.
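The compositing equation in the TL;DR can be illustrated with a minimal NumPy sketch for a single pixel. This is not the authors' implementation: the function name `composite_pixel` and the `modulation` array (standing in for the output of $\mathcal{F}_{VDGS}$) are hypothetical, and the sketch assumes the standard volume-rendering convention $\alpha_i = 1-\exp(-\cdot)$, which keeps every $\alpha_i$ in $[0, 1]$ for non-negative inputs.

```python
import numpy as np

def composite_pixel(colors, sigmas, modulation, deltas):
    """Front-to-back alpha compositing for one pixel.

    colors:     (N, 3) RGB of the N Gaussians covering the pixel, sorted near to far
    sigmas:     (N,) base opacities sigma_i (non-negative)
    modulation: (N,) view-dependent factors, a stand-in for F_VDGS(m_i, d; Theta)
    deltas:     (N,) per-sample extents delta_i (non-negative)
    """
    # alpha_i = 1 - exp(-sigma_i * F_i * delta_i)
    alphas = 1.0 - np.exp(-sigmas * modulation * deltas)
    # Transmittance T_i = prod_{j<i} (1 - alpha_j), with T_1 = 1
    transmittance = np.concatenate(([1.0], np.cumprod(1.0 - alphas)[:-1]))
    weights = alphas * transmittance
    # C(p) = sum_i c_i * alpha_i * T_i
    return weights @ colors, weights

# Assumed toy inputs: a nearly opaque red Gaussian in front of a green one.
colors = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
pixel, weights = composite_pixel(
    colors,
    sigmas=np.array([1e3, 1.0]),
    modulation=np.ones(2),
    deltas=np.ones(2),
)
```

In this toy case the front Gaussian absorbs essentially all of the transmittance, so the pixel comes out red. Setting `modulation` to zero for a Gaussian makes it fully transparent from that viewing direction, which is the kind of view-dependent effect the opacity update is meant to capture.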
Abstract
Neural Radiance Fields (NeRFs) have demonstrated the remarkable potential of neural networks to capture the intricacies of 3D objects. By encoding the shape and color information within neural network weights, NeRFs excel at producing strikingly sharp novel views of 3D objects. Recently, numerous generalizations of NeRFs utilizing generative models have emerged, expanding their versatility. In contrast, Gaussian Splatting (GS) offers similar render quality with faster training and inference, as it does not need neural networks to work. It encodes information about the 3D objects in a set of Gaussian distributions that can be rendered in 3D similarly to classical meshes. Unfortunately, GS models are difficult to condition, since they usually require around a hundred thousand Gaussian components. To mitigate the caveats of both models, we propose a hybrid model, Viewing Direction Gaussian Splatting (VDGS), that uses the GS representation of the 3D object's shape and a NeRF-based encoding of color and opacity. Our model uses Gaussian distributions with trainable positions (i.e., the means of the Gaussians), shapes (i.e., the covariances of the Gaussians), colors, and opacities, together with a neural network that takes the Gaussian parameters and the viewing direction and produces changes in the said color and opacity. As a result, our model better describes shadows, light reflections, and the transparency of 3D objects without adding extra texture and light components.
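The idea of a network that maps Gaussian parameters and a viewing direction to changes in color and opacity can be sketched as follows. Everything here is an illustrative assumption rather than the paper's architecture: the two-layer MLP, the layer sizes, the additive color update, the multiplicative opacity scale in $(0, 2)$, and the names `init_mlp` and `view_dependent_update` are all hypothetical, and the weights are random rather than trained.

```python
import numpy as np

def init_mlp(rng, in_dim=6, hidden=32, out_dim=4):
    # Hypothetical layer sizes; not taken from the paper.
    return {
        "W1": rng.normal(0.0, 0.1, (in_dim, hidden)), "b1": np.zeros(hidden),
        "W2": rng.normal(0.0, 0.1, (hidden, out_dim)), "b2": np.zeros(out_dim),
    }

def view_dependent_update(params, means, view_dir, colors, opacities):
    """Return per-Gaussian colors and opacities adjusted for one viewing direction.

    means:     (N, 3) Gaussian centers
    view_dir:  (3,) viewing direction (normalized inside)
    colors:    (N, 3) base RGB in [0, 1]
    opacities: (N,) base opacities in [0, 1]
    """
    d = view_dir / np.linalg.norm(view_dir)
    # Network input: Gaussian mean concatenated with the shared unit direction.
    x = np.concatenate([means, np.broadcast_to(d, means.shape)], axis=1)  # (N, 6)
    h = np.maximum(x @ params["W1"] + params["b1"], 0.0)  # ReLU hidden layer
    out = h @ params["W2"] + params["b2"]                 # (N, 4)
    delta_color = out[:, :3]
    # Assumed form of the opacity change: a scale in (0, 2), equal to 1 at output 0.
    opacity_scale = 2.0 / (1.0 + np.exp(-out[:, 3]))
    new_colors = np.clip(colors + delta_color, 0.0, 1.0)
    new_opacities = np.clip(opacities * opacity_scale, 0.0, 1.0)
    return new_colors, new_opacities

rng = np.random.default_rng(0)
params = init_mlp(rng)
means = rng.normal(size=(5, 3))
colors = np.full((5, 3), 0.5)
opacities = np.full(5, 0.5)
front = view_dependent_update(params, means, np.array([0.0, 0.0, 1.0]), colors, opacities)
back = view_dependent_update(params, means, np.array([0.0, 0.0, -1.0]), colors, opacities)
```

The geometry (means and covariances) is untouched here; only appearance changes with the viewing direction, so the same set of Gaussians yields different colors and opacities for `front` and `back`.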
