Beyond Anonymization: Object Scrubbing for Privacy-Preserving 2D and 3D Vision Tasks
Murat Bilgehan Ertan, Ronak Sahu, Phuong Ha Nguyen, Kaleel Mahmood, Marten van Dijk
TL;DR
ROAR is a privacy-preserving dataset obfuscation framework that removes sensitive objects through instance segmentation and generative inpainting, preserving scene integrity for both 2D detection and 3D NeRF reconstruction. It combines Mask2Former segmentation, latent-diffusion or GAN-based inpainting, and an oracle-based re-annotation step to quantify privacy-utility trade-offs on COCO and NeRF datasets. Scrubbing retains substantially more utility than image dropping: it reaches 87.5% of baseline AP in 2D detection and loses at most 1.66 dB PSNR in 3D reconstruction, with diffusion-based methods delivering superior perceptual quality. Overall, ROAR demonstrates that object removal can offer strong privacy guarantees with minimal impact on downstream tasks, while identifying open challenges in segmentation robustness and cross-task generalization.
Abstract
We introduce ROAR (Robust Object Removal and Re-annotation), a scalable framework for privacy-preserving dataset obfuscation that eliminates sensitive objects instead of modifying them. Our method integrates instance segmentation with generative inpainting to remove identifiable entities while preserving scene integrity. Extensive evaluations on 2D COCO-based object detection show that ROAR achieves 87.5% of the baseline detection average precision (AP), whereas image dropping achieves only 74.2% of the baseline AP, highlighting the advantage of scrubbing in preserving dataset utility. Performance degradation is even more severe for small objects, owing to occlusion and the loss of fine-grained details. Furthermore, in NeRF-based 3D reconstruction, our method incurs a PSNR loss of at most 1.66 dB while maintaining SSIM and improving LPIPS, demonstrating superior perceptual quality. Our findings establish object removal as an effective privacy framework that achieves strong privacy guarantees with minimal performance trade-offs. The results also highlight open challenges in generative inpainting, occlusion-robust segmentation, and task-specific scrubbing, laying the foundation for future advances in privacy-preserving vision systems.
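The remove-then-re-annotate idea described above can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `scrub` and `psnr`, the `SENSITIVE_LABELS` set, and the instance dictionary layout are all hypothetical. The instance masks are assumed to come from a segmentation model, the mean-colour fill is only a stand-in for generative inpainting, and re-annotation is reduced to dropping the labels of removed objects rather than the oracle-based step the paper uses.

```python
import numpy as np

# Assumption: which classes count as sensitive (hypothetical choice).
SENSITIVE_LABELS = {"person"}


def scrub(image, instances, sensitive=SENSITIVE_LABELS):
    """Remove sensitive instances; return (scrubbed_image, kept_annotations).

    image: float array of shape (H, W, 3) with values in [0, 1].
    instances: list of dicts with keys "label" (str) and "mask" (bool, H x W),
               assumed to be produced by an instance segmentation model.
    """
    remove = np.zeros(image.shape[:2], dtype=bool)
    kept = []
    for inst in instances:
        if inst["label"] in sensitive:
            remove |= inst["mask"]   # accumulate pixels to scrub
        else:
            kept.append(inst)        # simplified re-annotation: keep the rest

    scrubbed = image.copy()
    if remove.any() and not remove.all():
        # Placeholder for generative inpainting: fill with the mean colour
        # of the untouched pixels. A real pipeline would call an inpainter.
        scrubbed[remove] = image[~remove].mean(axis=0)
    return scrubbed, kept


def psnr(reference, test, max_value=1.0):
    """Peak signal-to-noise ratio in dB between two images of equal shape."""
    mse = np.mean((reference - test) ** 2)
    if mse == 0:
        return float("inf")
    return 10.0 * np.log10(max_value ** 2 / mse)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    image = rng.random((8, 8, 3))
    person = np.zeros((8, 8), dtype=bool)
    person[2:5, 2:5] = True
    chair = np.zeros((8, 8), dtype=bool)
    chair[6:8, 6:8] = True
    scrubbed, kept = scrub(
        image,
        [{"label": "person", "mask": person}, {"label": "chair", "mask": chair}],
    )
    print([inst["label"] for inst in kept])
    print(f"PSNR vs. original: {psnr(image, scrubbed):.2f} dB")
```

Comparing the PSNR of a scrubbed image against its original only measures how much the pixels changed. The PSNR loss reported above concerns NeRF reconstruction quality, which requires training and rendering a scene and is outside the scope of this sketch.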
