SpecGaussian with Latent Features: A High-quality Modeling of the View-dependent Appearance for 3D Gaussian Splatting

Zhiru Wang, Shiyun Xie, Chengwei Pan, Guoping Wang

TL;DR

Latent-SpecGS attaches a universal latent neural descriptor to each 3D Gaussian. This gives a more effective representation of 3D feature fields, including appearance and geometry, and achieves competitive performance in novel view synthesis.

Abstract

Recently, the 3D Gaussian Splatting (3D-GS) method has achieved great success in novel view synthesis, providing real-time rendering while maintaining high-quality results. However, the method struggles to model specular reflections and anisotropic appearance components, especially view-dependent color under complex lighting conditions. In addition, 3D-GS uses spherical harmonics to learn the color representation, which limits its ability to represent complex scenes. To overcome these challenges, we introduce Latent-SpecGS, an approach that utilizes a universal latent neural descriptor within each 3D Gaussian. This enables a more effective representation of 3D feature fields, including appearance and geometry. Moreover, two parallel CNNs are designed to decode the splatted feature maps into diffuse color and specular color separately. A viewpoint-dependent mask is learned to merge these two colors into the final rendered image. Experimental results demonstrate that our method obtains competitive performance in novel view synthesis and extends the ability of 3D-GS to handle intricate scenarios with specular reflections.
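
To make the spherical-harmonics point concrete, the following minimal sketch evaluates one color channel of a single Gaussian from degree-0 and degree-1 spherical harmonic coefficients. It is an illustration, not the paper's code: the function names are hypothetical, the sign convention and the 0.5 offset are assumed to follow common 3D-GS implementations, and the clamp to [0, 1] is a simplification. Because the degree-1 terms are linear in the view direction, the color can only vary smoothly across viewpoints, which suggests why low-degree harmonics have difficulty with sharp specular highlights.

```python
import math

# Real spherical harmonic basis constants for degrees 0 and 1.
SH_C0 = 0.28209479177387814  # 1 / (2 * sqrt(pi))
SH_C1 = 0.4886025119029199   # sqrt(3 / (4 * pi))


def normalize(v):
    """Return the unit vector pointing in the direction of v."""
    norm = math.sqrt(sum(c * c for c in v))
    return tuple(c / norm for c in v)


def sh_color_deg1(coeffs, view_dir):
    """Evaluate one color channel from 4 SH coefficients (degree <= 1).

    coeffs:   [c0, c1, c2, c3], hypothetical learned values for one channel.
    view_dir: direction from the camera to the Gaussian (any length).
    The sign pattern and the +0.5 offset are assumptions borrowed from
    common 3D-GS implementations.
    """
    x, y, z = normalize(view_dir)
    value = (SH_C0 * coeffs[0]
             - SH_C1 * y * coeffs[1]
             + SH_C1 * z * coeffs[2]
             - SH_C1 * x * coeffs[3])
    # Shift and clamp so the result is a displayable intensity.
    return min(max(value + 0.5, 0.0), 1.0)
```

With only the degree-0 coefficient set, the result is the same for every view direction; a nonzero degree-1 coefficient makes the color change linearly as the viewpoint moves.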

Paper Structure (24 sections, 16 equations, 6 figures, 5 tables)

Figures (6)

  • Figure 1: Pipeline of our proposed Latent-SpecGaussian. It consists of three steps: first, initializing with SfM points derived from COLMAP; second, optimizing the latent feature within each 3D Gaussian together with the directional feature; finally, splatting to generate multiple feature maps and decoding color using two parallel networks.
  • Figure 2: Schematic of Latent 3D-GS. In addition to the latent features attached to the 3DGS, we also use these features to predict normals and decode the viewpoint mask features.
  • Figure 3: Network architecture. It includes a UNet that decodes diffuse color and a CNN that decodes view-dependent color. The view-dependent color is multiplied by the view mask and superimposed on the diffuse color, resulting in the final rendering.
  • Figure 4: Results under sparse-view conditions for the "Stump" scene of the Mip-NeRF 360 dataset.
  • Figure 5: Qualitative comparisons on the Shiny, Mip-NeRF 360, and Deep Blending datasets.
  • ...and 1 more figures
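
The merge step described in the Figure 3 caption can be sketched as follows: the view-dependent (specular) color is scaled by the view mask and added to the diffuse color. This is a minimal illustration under stated assumptions rather than the authors' implementation: the function names are hypothetical, images are plain nested lists of RGB tuples, the mask is taken to be one scalar per pixel, and the final clamp to [0, 1] is an added simplification.

```python
def clamp01(x):
    """Clamp a value to the displayable range [0, 1]."""
    return min(max(x, 0.0), 1.0)


def merge_pixel(diffuse, specular, mask):
    """Combine one pixel: final = diffuse + mask * specular (assumed form).

    diffuse, specular: RGB tuples, e.g. outputs of the two decoders.
    mask: scalar view-mask value for this pixel.
    """
    return tuple(clamp01(d + mask * s) for d, s in zip(diffuse, specular))


def merge_image(diffuse_img, specular_img, mask_img):
    """Apply merge_pixel to every pixel of row-major nested-list images."""
    return [
        [merge_pixel(d, s, m) for d, s, m in zip(d_row, s_row, m_row)]
        for d_row, s_row, m_row in zip(diffuse_img, specular_img, mask_img)
    ]
```

Where the mask is 0 the output equals the diffuse color, and where it is 1 the full specular contribution is added, so the mask controls how strongly the highlight appears at each pixel.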