
FlashSplat: 2D to 3D Gaussian Splatting Segmentation Solved Optimally

Qiuhong Shen, Xingyi Yang, Xinchao Wang

TL;DR

The core insight of the method is that, with a reconstructed 3D-GS scene, the rendering of the 2D masks is essentially a linear function with respect to the labels of each Gaussian, and the optimal label assignment can be solved via linear programming in closed form.

Abstract

This study addresses the challenge of accurately segmenting 3D Gaussian Splatting from 2D masks. Conventional methods often rely on iterative gradient descent to assign each Gaussian a unique label, leading to lengthy optimization and sub-optimal solutions. Instead, we propose a straightforward yet globally optimal solver for 3D-GS segmentation. The core insight of our method is that, with a reconstructed 3D-GS scene, the rendering of the 2D masks is essentially a linear function with respect to the labels of each Gaussian. As such, the optimal label assignment can be solved via linear programming in closed form. This solution capitalizes on the alpha blending characteristic of the splatting process for single-step optimization. By incorporating the background bias in our objective function, our method shows superior robustness against noise in 3D segmentation. Remarkably, our optimization completes within 30 seconds, about 50$\times$ faster than the best existing methods. Extensive experiments demonstrate the efficiency and robustness of our method in segmenting various scenes, and its superior performance in downstream tasks such as object removal and inpainting. Demos and code will be available at https://github.com/florinshen/FlashSplat.
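To make the linearity claim concrete, here is a minimal NumPy sketch on toy data. It is not the authors' implementation, and the paper's exact objective is not reproduced in this excerpt. It assumes an L1 error between the rendered mask and a binary 2D mask, a single view, and the same depth order of Gaussians at every pixel. All names (`blend_weights`, `render_mask`, `assign_labels`) are hypothetical. Under these assumptions the rendered mask is a weighted sum of the per-Gaussian labels, so each Gaussian can be labeled independently by comparing its accumulated contribution to foreground pixels against its contribution to background pixels.

```python
import itertools
import numpy as np

def blend_weights(alphas):
    """Alpha-blending weights w_i = alpha_i * prod_{j<i} (1 - alpha_j).

    alphas: (P, N) opacities of N depth-sorted Gaussians at P pixels.
    """
    ones = np.ones((alphas.shape[0], 1))
    transmittance = np.cumprod(np.hstack([ones, 1.0 - alphas[:, :-1]]), axis=1)
    return alphas * transmittance

def render_mask(weights, labels):
    """Rendered mask value per pixel; linear in the per-Gaussian labels."""
    return weights @ labels

def assign_labels(weights, mask):
    """Closed-form minimizer of sum_p |render_p - mask_p| over binary labels."""
    a_fg = weights.T @ mask          # contribution landing on foreground pixels
    a_bg = weights.T @ (1.0 - mask)  # contribution landing on background pixels
    return (a_fg > a_bg).astype(float)

# Toy data (hypothetical): 6 pixels, 4 Gaussians, one view.
rng = np.random.default_rng(0)
alphas = rng.uniform(0.05, 0.9, size=(6, 4))
mask = rng.integers(0, 2, size=6).astype(float)
weights = blend_weights(alphas)

labels = assign_labels(weights, mask)
closed_form_error = np.abs(render_mask(weights, labels) - mask).sum()

# Sanity check: exhaustive search over all 2^4 labelings.
brute_force_error = min(
    np.abs(render_mask(weights, np.array(c, dtype=float)) - mask).sum()
    for c in itertools.product([0, 1], repeat=4)
)
print(closed_form_error, brute_force_error)
```

The per-Gaussian rule is optimal here because the blending weights at each pixel are non-negative and sum to at most 1, so the rendered value stays in [0, 1] and the absolute error against a binary mask becomes linear in the labels. The exhaustive search is only feasible for a toy; the closed-form rule needs a single accumulation pass.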

Paper Structure (27 sections, 6 equations, 11 figures, 3 tables)


Figures (11)

  • Figure 1: GS Rasterization. Here we illustrate projected 2D Gaussians in 3DGS rasterization, setting each tile as $2\times 2$ for illustration purposes. Gaussians are shared between different tiles and are likely to be shared among different instances (color blocks).
  • Figure 2: The effect of the background bias $\gamma$. By adjusting the background bias in our optimal assignment, the noisy 3D segmentation caused by 2D masks can be flexibly mitigated for various downstream applications.
  • Figure 3: Novel view mask rendering. Here we showcase mask rendering on novel views for both binary and scene segmentation. Simple alpha mask quantization can generate consistent masks in binary segmentation. With depth guidance, scene segmentation can also generate feasible 2D masks in novel views.
  • Figure 4: Qualitative results of FlashSplat. FlashSplat is capable of performing both binary segmentation and scene segmentation. With our single-step optimal assignment, all 3D segmentation is completed within 30 seconds. These segmentation results show that FlashSplat can robustly segment 3D objects and remove objects from 3D scenes.
  • Figure 5: Object inpainting after removal. Here we showcase object removal results from 3D scenes, including the kitchen scene from the MIP-360 dataset and the bear scene from Instruct-NeRF2NeRF instance-nerf. After removal, we inpaint the regions with artifacts by tuning the optimized 3DGS parameters. Our object inpainting diminishes the artifacts left by object removal.
  • ...and 6 more figures
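
The caption of Figure 2 says that adjusting the background bias $\gamma$ mitigates noisy 3D segmentation. The sketch below shows one plausible way such a bias could enter a per-Gaussian assignment. The specific form (normalizing the foreground and background contributions, then adding $\gamma$ to the background score) is an assumption for illustration and is not taken from the paper's equations. The function name `assign_with_bias` and the toy numbers are hypothetical.

```python
import numpy as np

def assign_with_bias(a_fg, a_bg, gamma=0.0, eps=1e-12):
    """Label each Gaussian as foreground (1) or background (0).

    a_fg, a_bg: accumulated blending contributions of each Gaussian to
    foreground and background mask pixels (assumed precomputed).
    gamma: background bias; larger values favor the background label.
    """
    total = a_fg + a_bg + eps        # eps avoids division by zero
    fg_score = a_fg / total
    bg_score = a_bg / total + gamma  # assumed form of the bias
    return (fg_score > bg_score).astype(np.int64)

# Toy contributions for four Gaussians (hypothetical numbers).
a_fg = np.array([0.9, 0.55, 0.45, 0.1])
a_bg = np.array([0.1, 0.45, 0.55, 0.9])

for gamma in (-0.2, 0.0, 0.2):
    print(gamma, assign_with_bias(a_fg, a_bg, gamma))
```

In this toy, a positive $\gamma$ keeps only Gaussians that contribute strongly to the foreground, which would give a cleaner extracted object. A negative $\gamma$ also labels borderline Gaussians as foreground, which would suit removal tasks where leftover fragments are more harmful than removing slightly too much.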