Table of Contents
Fetching ...

Modeling Uncertainty in 3D Gaussian Splatting through Continuous Semantic Splatting

Joey Wilson, Marcelino Almeida, Min Sun, Sachit Mahajan, Maani Ghaffari, Parker Ewen, Omid Ghasemalizadeh, Cheng-Hao Kuo, Arnie Sen

TL;DR

A novel algorithm for probabilistically updating and rasterizing semantic maps within 3D Gaussian Splatting, combining the precise structure of 3D-GS with the ability to quantify uncertainty of probabilistic robotic maps.

Abstract

In this paper, we present a novel algorithm for probabilistically updating and rasterizing semantic maps within 3D Gaussian Splatting (3D-GS). Although previous methods have introduced algorithms which learn to rasterize features in 3D-GS for enhanced scene understanding, 3D-GS can fail without warning which presents a challenge for safety-critical robotic applications. To address this gap, we propose a method which advances the literature of continuous semantic mapping from voxels to ellipsoids, combining the precise structure of 3D-GS with the ability to quantify uncertainty of probabilistic robotic maps. Given a set of images, our algorithm performs a probabilistic semantic update directly on the 3D ellipsoids to obtain an expectation and variance through the use of conjugate priors. We also propose a probabilistic rasterization which returns per-pixel segmentation predictions with quantifiable uncertainty. We compare our method with similar probabilistic voxel-based methods to verify our extension to 3D ellipsoids, and perform ablation studies on uncertainty quantification and temporal smoothing.

Modeling Uncertainty in 3D Gaussian Splatting through Continuous Semantic Splatting

TL;DR

A novel algorithm for probabilistically updating and rasterizing semantic maps within 3D Gaussian Splatting, combining the precise structure of 3D-GS with the ability to quantify uncertainty of probabilistic robotic maps.

Abstract

In this paper, we present a novel algorithm for probabilistically updating and rasterizing semantic maps within 3D Gaussian Splatting (3D-GS). Although previous methods have introduced algorithms which learn to rasterize features in 3D-GS for enhanced scene understanding, 3D-GS can fail without warning which presents a challenge for safety-critical robotic applications. To address this gap, we propose a method which advances the literature of continuous semantic mapping from voxels to ellipsoids, combining the precise structure of 3D-GS with the ability to quantify uncertainty of probabilistic robotic maps. Given a set of images, our algorithm performs a probabilistic semantic update directly on the 3D ellipsoids to obtain an expectation and variance through the use of conjugate priors. We also propose a probabilistic rasterization which returns per-pixel segmentation predictions with quantifiable uncertainty. We compare our method with similar probabilistic voxel-based methods to verify our extension to 3D ellipsoids, and perform ablation studies on uncertainty quantification and temporal smoothing.

Paper Structure

This paper contains 14 sections, 18 equations, 5 figures, 4 tables.

Figures (5)

  • Figure 1: While 3D-GS may provide high quality renderings of the environment at novel views with sufficient training data, it may fail to render views which are occluded, unseen, or at different angles from the training data. In the above image, CSS produces semantic (c) and RGB (b) predictions at a novel view without sufficient training data, resulting in a blurry render and incorrect segmentation. Through probabilistic inference, CSS identifies blurs and gaps in the render which correlate with reconstruction quality (d).
  • Figure 2: 3D-GS renders pixels as a linear combination of 3D ellipsoids, where the influence of each 3D ellipsoid is determined by the spatial position and shape of the ellipsoid $x_n$ relative to pixel $x_i$ as $\kappa(x_i, x_n)$. We propose to leverage the learned expressive kernels of 3D-GS to perform a probabilistic semantic update and rasterization which enables uncertainty quantification.
  • Figure 3: Comparison of our method to a probabilistic voxel baseline on the KITTI driving dataset. Our method achieves similar segmentation results on pixels predicted by the voxel method, and predicts more of the scene due to the lack of a requirement for accurate depth. Additionally, our method does not have discretization, which is beneficial for fine categories such as poles.
  • Figure 4: Rasterizations from our method on an indoor environment. Our method achieves high quality semantic renderings using ground truth segmentation, shown in (b). Even with noisy segmentation input our method is capable of improving the segmentation with temporal smoothing (c), and can quantify uncertainty (d).
  • Figure 5: Sparsification plot of pixel-level and image-level uncertainty. Uncertainty quantification from the expectation and variance are effective at both the image and pixel level.