Remove360: Benchmarking Residuals After Object Removal in 3D Gaussian Splatting

Simona Kocour, Assia Benbihi, Torsten Sattler

Abstract

An object can disappear from a 3D scene, yet still be detectable. Even after visual removal, modern vision models may infer what was originally present. In this work, we introduce a novel benchmark and evaluation framework to quantify semantic residuals, the unintended cues left behind after object removal in 3D Gaussian Splatting. We conduct experiments across a diverse set of indoor and outdoor scenes, showing that current methods often preserve semantic information despite the absence of visual geometry. Notably, even when removal is followed by inpainting, residual cues frequently remain detectable by foundation models. We also present Remove360, a real-world dataset of pre- and post-removal RGB captures with object-level masks. Unlike prior datasets focused on isolated object instances, Remove360 contains complex, cluttered scenes that enable evaluation of object removal in full-scene settings. By leveraging the ground-truth post-removal images, we directly assess whether semantic presence is eliminated and whether downstream models can still infer what was removed. Our results reveal a consistent gap between geometric removal and semantic erasure, exposing critical limitations in existing 3D editing pipelines and highlighting the need for privacy-aware removal methods that eliminate recoverable cues, not only visible geometry. Dataset and evaluation code are publicly available.

Paper Structure

This paper contains 23 sections, 4 equations, 20 figures, 10 tables.

Figures (20)

  • Figure 1: Residual semantic cues after object removal in 3D Gaussian Splatting. Although the table is visually removed, segmentation and depth models can still detect traces of its prior presence. Top: scene before removal. Bottom: scene after removal. Left to right: RGB rendering, SAM [kirillov2023segany] masks, GroundedSAM [kirillov2023segany, liu2023grounding, ren2024grounded] overlay, and depth map.
  • Figure 2: Semantic segmentation changes before and after removal on Remove360. Left to right: GroundedSAM2 [kirillov2023segany, liu2023grounding, ren2024grounded] detections for the renderings before removal, the renderings after removal, and GroundedSAM2 detections for the renderings after removal. The semantic masks are used to compute the change in semantic segmentation in Eq. \ref{eq:gsam_iou} and the corresponding accuracy in Eq. \ref{eq:acc_seg}. Rows: different object removals. Although a human cannot recognize the removed objects, the segmentation model still can. The pixel distribution in the edited area after removal might still exhibit patterns characteristic of the object, similar to what occurs in adversarial attacks. These are not false-positive detections, because no object semantics are detected in the ground-truth post-removal images.
  • Figure 3: SAM [kirillov2023segany, ravi2024sam2] mask comparison on Remove360. Object removal alters SAM masks, and smaller changes relative to ground-truth masks indicate better removal. These differences are used to compute the similarity score Eq. \ref{eq:sim_sam}. Left to right: SAM overlay before removal, after removal, and ground-truth with the object mask (green outline).
  • Figure 4: Depth changes before and after removal. Left to right: Rendered depth before removal, rendered depth after removal, thresholded depth difference and ground-truth outline of the object to be removed in green. This depth difference is used for evaluation in Eq. \ref{eq:acc_depth} and highlights the regions where geometry was modified. In this case, the visualization illustrates under-removal within the target object area.
  • Figure 5: Overview of the Remove360 dataset. Samples from 11 scenes (5 indoor, 6 outdoor) with varied object counts, layouts, and interactions. Removed objects are highlighted with bounding boxes.
  • ...and 15 more figures
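
The captions of Figures 2 and 4 describe two comparisons: a thresholded depth difference between renderings before and after removal, and an overlap between binary masks. The paper's own definitions (the equations referenced above) are not reproduced in this excerpt, so the following is only a minimal sketch of the general idea, not the authors' metric. The function names, the threshold value, and the toy 2x3 depth maps are all hypothetical.

```python
from typing import List

Grid = List[List[float]]  # per-pixel depth values (hypothetical toy format)
Mask = List[List[bool]]   # per-pixel binary mask


def depth_change_mask(depth_before: Grid, depth_after: Grid, threshold: float) -> Mask:
    """Mark pixels whose rendered depth changed by more than `threshold`."""
    return [
        [abs(after - before) > threshold for before, after in zip(row_b, row_a)]
        for row_b, row_a in zip(depth_before, depth_after)
    ]


def mask_iou(mask_a: Mask, mask_b: Mask) -> float:
    """Intersection over union of two binary masks (0.0 if both are empty)."""
    intersection = 0
    union = 0
    for row_a, row_b in zip(mask_a, mask_b):
        for a, b in zip(row_a, row_b):
            intersection += int(a and b)
            union += int(a or b)
    return intersection / union if union else 0.0


# Toy example: the object occupies the two left columns. After "removal",
# one object pixel keeps its old depth, i.e. the object is under-removed.
depth_before = [[2.0, 2.0, 5.0],
                [2.0, 2.0, 5.0]]
depth_after = [[5.0, 2.0, 5.0],
               [5.0, 5.0, 5.0]]
object_mask = [[True, True, False],
               [True, True, False]]

changed = depth_change_mask(depth_before, depth_after, threshold=0.5)
overlap = mask_iou(changed, object_mask)  # 3 changed pixels out of 4 object pixels
```

In this toy case the overlap is below 1 because one pixel inside the object mask shows no depth change, which mirrors the under-removal situation described for Figure 4. The same `mask_iou` helper could compare semantic masks before and after removal, where a high overlap would instead indicate that the object is still being detected.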