Feature Splatting for Better Novel View Synthesis with Low Overlap

T. Berriel Martins, Javier Civera

TL;DR

FeatSplat replaces spherical-harmonic color encodings in 3D Gaussian Splatting with learnable per-Gaussian feature vectors, enabling richer textures and better generalization for novel view synthesis, especially at low view overlap. By alpha-blending Gaussian features to form per-pixel features and decoding them with a compact MLP conditioned on a camera embedding, the method supports both high-quality RGB rendering and per-pixel semantic segmentation. Empirical results on Mip-360, Tanks and Temples, Deep Blending, and ScanNet++ show FeatSplat achieving superior or competitive PSNR, SSIM, and LPIPS, while reducing memory and enabling real-time rendering; 32-dim features often outperform 16-dim variants. The approach also supports lighting manipulation at inference and extends to semantic segmentation with modest additional capacity, marking a practical, flexible improvement over SH-based 3DGS with meaningful implications for robotics, AR/VR, and open-vocabulary tasks.

Abstract

3D Gaussian Splatting has emerged as a very promising scene representation, achieving state-of-the-art quality in novel view synthesis significantly faster than competing alternatives. However, its use of spherical harmonics to represent scene colors limits the expressivity of 3D Gaussians and, as a consequence, the capability of the representation to generalize as we move away from the training views. In this paper, we propose to encode the color information of 3D Gaussians into per-Gaussian feature vectors, an approach we denote Feature Splatting (FeatSplat). To synthesize a novel view, Gaussians are first "splatted" onto the image plane, then the corresponding feature vectors are alpha-blended, and finally the blended vector is decoded by a small MLP to render the RGB pixel values. To further inform the model, we concatenate a camera embedding to the blended feature vector, so that the decoding is also conditioned on viewpoint information. Our experiments show that this novel model for encoding the radiance considerably improves novel view synthesis for low-overlap views that are distant from the training views. Finally, we also show the capacity and convenience of our feature vector representation, demonstrating its capability to generate not only RGB values for novel views, but also their per-pixel semantic labels. Code available at https://github.com/tberriel/FeatSplat. Keywords: Gaussian Splatting, Novel View Synthesis, Feature Splatting
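The rendering path described in the abstract (alpha-blend per-Gaussian features, concatenate a camera embedding, decode with a small MLP) can be illustrated for a single pixel. The following is a minimal sketch in plain Python, not the authors' implementation: the function names, the one-hidden-layer ReLU network, the sigmoid output, and the random initialization are all assumptions made for illustration. It also assumes the Gaussians covering the pixel are already projected and depth-sorted, nearest first, so blending follows the usual front-to-back rule in which each feature is weighted by its opacity times the transmittance accumulated so far.

```python
import math
import random


def alpha_blend(features, alphas):
    """Front-to-back alpha blending of feature vectors for one pixel.

    features: depth-sorted list of per-Gaussian feature vectors (nearest first).
    alphas:   per-Gaussian opacity at this pixel, each in [0, 1].
    """
    dim = len(features[0])
    blended = [0.0] * dim
    transmittance = 1.0  # fraction of light not yet absorbed
    for feat, alpha in zip(features, alphas):
        weight = alpha * transmittance
        for k in range(dim):
            blended[k] += weight * feat[k]
        transmittance *= 1.0 - alpha
    return blended


def linear(x, weights, bias):
    """Dense layer: one output per row of `weights`."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def init_params(in_dim, hidden_dim, out_dim, seed=0):
    """Hypothetical random initialization for a two-layer MLP."""
    rng = random.Random(seed)

    def matrix(rows, cols):
        return [[rng.uniform(-0.1, 0.1) for _ in range(cols)]
                for _ in range(rows)]

    return {
        "w1": matrix(hidden_dim, in_dim), "b1": [0.0] * hidden_dim,
        "w2": matrix(out_dim, hidden_dim), "b2": [0.0] * out_dim,
    }


def decode_rgb(blended, camera_embedding, params):
    """Concatenate the camera embedding and decode to RGB in (0, 1).

    The ReLU hidden layer and sigmoid output are assumptions of this sketch.
    """
    x = list(blended) + list(camera_embedding)  # concatenation
    hidden = [max(0.0, v) for v in linear(x, params["w1"], params["b1"])]
    out = linear(hidden, params["w2"], params["b2"])
    return [1.0 / (1.0 + math.exp(-v)) for v in out]


# Two Gaussians with 2-dim features, each with opacity 0.5 at this pixel.
blended = alpha_blend([[1.0, 0.0], [0.0, 1.0]], [0.5, 0.5])
params = init_params(in_dim=4, hidden_dim=8, out_dim=3)
rgb = decode_rgb(blended, [0.1, -0.2], params)
```

In this toy case the nearer Gaussian contributes with weight 0.5 and the farther one with weight 0.25, so `blended` is `[0.5, 0.25]`. A semantic head would only change the output dimension of the last layer; the feature dimensions, embedding size, and network width used in the paper are not reproduced here.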

Paper Structure (27 sections, 5 equations, 11 figures, 4 tables)

This paper contains 27 sections, 5 equations, 11 figures, 4 tables.

Figures (11)

  • Figure 1: Overview of our approach. Left: We augment the 3D Gaussian Splatting (Kerbl et al., 2023) representation with learned feature vectors $\mathbf{f}$ to encode colors, removing the spherical harmonics. Center: To render a point of view, 3D Gaussians are projected onto the image plane, where the differentiable rasterizer alpha-blends the corresponding feature vectors. The resulting vectors are concatenated with a camera embedding, and a tiny MLP renders the final RGB values, potentially along with higher-level information such as semantic labels. Right: Illustrative result of novel view synthesis with low overlap, showing RGB values and semantic labels.
  • Figure 2: Comparison of 3D position of test (red) and training (blue) points of view on one scene for each of the four datasets.
  • Figure 3: Comparison of our method, trained with 16-dimensional (FeatSplat-16) and 32-dimensional (FeatSplat-32) features, against prior work. From top to bottom, the scenes are Bicycle, Bonsai, and Room from Mip-360 (Barron et al., 2022), and Playroom from DB (Hedman et al., 2018).
  • Figure 4: Novel view synthesis with low overlap: FeatSplat-32 against 3DGS. From left to right, the scenes are Train from T&T (Knapitsch et al., 2017), and Treehill and Bicycle from Mip-360 (Barron et al., 2022).
  • Figure 5: Global illumination changes obtained by modifying the MLP input values of the pixel embedding (PE) and the 3D coordinates (X, Y, Z), on scene 03f7a0e617 from ScanNet++.
  • ...and 6 more figures