Is there anything left? Measuring semantic residuals of objects removed from 3D Gaussian Splatting

Simona Kocour, Assia Benbihi, Aikaterini Adam, Torsten Sattler

TL;DR

This work tackles the privacy-oriented problem of residual information after removing objects from 3D Gaussian Splatting representations. It introduces a quantitative framework combining semantic, instance-level, and depth-based metrics, including $IoU_{drop}$, $acc_{seg}$, $sim_{SAM}$, and $acc_{\triangle depth}$, and couples them with a graph-cut based removal refinement that enforces spatial and semantic consistency. Through experiments on indoor/outdoor scenes with multiple removal methods, the authors show that methods like GaussianCut and GaussianGrouping yield strong removals, while a user study supports the alignment between metrics and perceived removal quality. The results advance privacy-preserving mapping by providing robust, multi-faceted evaluation tools and a practical refinement technique, along with open-source code and data to catalyze further research.
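As a rough illustration of the idea behind $IoU_{drop}$, the sketch below compares how well a segmentation model recovers the object before and after removal. It assumes the metric is the difference between the two IoU scores against a pseudo-ground-truth object mask; this summary does not state the exact definition, so treat that as an assumption. The function names `mask_iou` and `iou_drop` and the toy masks are hypothetical.

```python
import numpy as np


def mask_iou(pred: np.ndarray, target: np.ndarray) -> float:
    """Intersection over union of two boolean masks of equal shape."""
    pred = pred.astype(bool)
    target = target.astype(bool)
    union = np.logical_or(pred, target).sum()
    if union == 0:
        # Both masks are empty: define the IoU as 0 to avoid dividing by zero.
        return 0.0
    return float(np.logical_and(pred, target).sum() / union)


def iou_drop(seg_before: np.ndarray, seg_after: np.ndarray,
             object_mask: np.ndarray) -> float:
    """Assumed definition: IoU before removal minus IoU after removal.

    A larger value means the segmentation model finds less of the object
    once it has been removed from the scene.
    """
    return mask_iou(seg_before, object_mask) - mask_iou(seg_after, object_mask)


# Toy 4x4 example: the object covers a 2x2 block of pixels.
object_mask = np.zeros((4, 4), dtype=bool)
object_mask[1:3, 1:3] = True

seg_before = object_mask.copy()            # object fully segmented before removal
seg_after = np.zeros((4, 4), dtype=bool)
seg_after[1, 1] = True                     # one residual pixel still segmented

drop = iou_drop(seg_before, seg_after, object_mask)
print(drop)
```

In this toy case the IoU falls from 1.0 to 0.25, so the drop is 0.75; a perfect removal with no residual segmentation would give a drop of 1.0.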

Abstract

Searching in and editing 3D scenes has become extremely intuitive with trainable scene representations that allow linking human concepts to elements in the scene. These operations are often evaluated on the basis of how accurately the searched element is segmented or extracted from the scene. In this paper, we address the inverse problem, that is, how much of the searched element remains in the scene after it is removed. This question is particularly important in the context of privacy-preserving mapping when a user reconstructs a 3D scene and wants to remove private elements before sharing the map. To the best of our knowledge, this is the first work to address this question. To answer this, we propose a quantitative evaluation that measures whether a removal operation leaves object residuals that can be reasoned over. The scene is not private when such residuals are present. Experiments on state-of-the-art scene representations show that the proposed metrics are meaningful and consistent with the user study that we also present. We also propose a method to refine the removal based on spatial and semantic consistency.

Paper Structure

This paper contains 18 sections, 6 equations, 21 figures, 8 tables.

Figures (21)

  • Figure 1: Is there any table left after it is removed from the scene? When residuals of the object remain and can be reasoned over (left column, middle row), the removal is imperfect. The proposed evaluation measures this by checking whether off-the-shelf semantic models can still segment the object after the removal. We also measure the presence of residuals in 3D with depth (right column, middle row). We derive an optimization to refine the removal based on spatial and semantic consistency (highlighted in pink in the bottom-left image). Top to bottom: rendering from the original 3DGS [kerbl3Dgaussians] scene, rendering after removing the table from the scene, rendering after removal and refinement. Left to right: RGB renderings, SAM [kirillov2023segany] mask overlay with pseudo-ground-truth object outline, GroundedSAM [kirillov2023segany, liu2023grounding, ren2024grounded] overlay, depth renderings.
  • Figure 2: SAM [kirillov2023segany, ravi2024sam2] masks before and after object removal. Left to right: before and after removal. Rows: the evaluated methods. When the object (green outline) is removed (rows 1 and 4), the SAM masks change, which is the signal we exploit to evaluate the removal: larger changes indicate better removal.
  • Figure 3: Depth changes before and after removal. Left to right: original rendered depth, rendered depth after removal, thresholded depth difference with the pseudo-ground-truth outline of the object to be removed in green. Top: GaussianCut [jain2024gaussiancut] shows precise and localized changes in the depth maps, indicating accurate object removal. Bottom: Feature3DGS [zhou2024feature] fails to remove the object, so the depth at the object location remains unchanged.
  • Figure 4: Segmentation changes before and after removal. Left to right: GroundedSAM [kirillov2023segany, liu2023grounding, ren2024grounded] overlay on the rendering before removal, rendering after removal, overlay after removal. The object is removed successfully but the segmentation model still finds it. One explanation is that the pixel distribution in the edited area still exhibits patterns characteristic of the object, similar to what occurs in adversarial attacks. Even though a human cannot recognize the object, GroundedSAM [kirillov2023segany, liu2023grounding, ren2024grounded] still manages to segment it.
  • Figure 5: Removal before and after refinement. Left: removal before refinement. The area to be removed by the refinement is highlighted in pink. Right: removal after refinement. The refinement captures the object's boundaries (top) and even relevant affordances, such as the elements on the tables to be removed.
  • ...and 16 more figures
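
The thresholded depth difference described in the Figure 3 caption can be sketched as follows. This is a minimal illustration, not the paper's $acc_{\triangle depth}$ implementation: it assumes the score is the fraction of pixels inside the pseudo-ground-truth object mask whose rendered depth changes by more than a threshold. The function name `depth_change_ratio`, the threshold value, and the toy depth maps are hypothetical.

```python
import numpy as np


def depth_change_ratio(depth_before: np.ndarray, depth_after: np.ndarray,
                       object_mask: np.ndarray, threshold: float = 0.1) -> float:
    """Fraction of object pixels whose depth changed by more than `threshold`.

    `threshold` is a hypothetical value in the same units as the depth maps.
    """
    object_mask = object_mask.astype(bool)
    if not object_mask.any():
        # No object pixels in this view: nothing to measure.
        return 0.0
    changed = np.abs(depth_after - depth_before) > threshold
    # Only pixels inside the object mask contribute to the score.
    return float(changed[object_mask].mean())


# Toy 4x4 example: the object covers a 2x2 block at depth 2.0.
depth_before = np.full((4, 4), 2.0)
object_mask = np.zeros((4, 4), dtype=bool)
object_mask[1:3, 1:3] = True

# After removal, three object pixels reveal the background at depth 5.0,
# while one pixel keeps its old depth (a residual).
depth_after = depth_before.copy()
depth_after[1:3, 1:3] = 5.0
depth_after[2, 2] = 2.0

ratio = depth_change_ratio(depth_before, depth_after, object_mask)
print(ratio)
```

Here three of the four object pixels change depth, so the ratio is 0.75. A method that leaves the object in place, as in the bottom row of Figure 3, would score close to 0 under this sketch.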