Gaussian Splatting Under Attack: Investigating Adversarial Noise in 3D Objects
Abdurrahman Zeybey, Mehmet Ergezer, Tommy Nguyen
TL;DR
The paper investigates the vulnerability of vision-language models, specifically CLIP ViT-B/16, to adversarial perturbations on 3D objects reconstructed by Gaussian Splatting. It introduces the Masked Iterative Fast Gradient Sign Method (M-IFGSM), which concentrates perturbations on object masks generated by SAM, and demonstrates that these perturbations significantly degrade CLIP's top-1 and top-5 accuracy for both original 2D inputs and 3D-rendered outputs from eight CO3D objects. The authors propose a three-stage pipeline (mask extraction, masked adversarial perturbation, and 3D model reconstruction) to transfer 2D adversarial noise into the 3D domain with minimal perceptual change. Key findings show substantial performance drops (e.g., top-1 accuracy from 95.4% to 12.5% on training images and from 91.2% to 35.4% on test images) and transferable degradation in rendered 3DGS models. These results highlight potential risks in autonomous driving, robotics, and surveillance, and motivate the development of robust defenses for 3D vision-language systems.
Abstract
3D Gaussian Splatting has advanced radiance field reconstruction, enabling high-quality view synthesis and fast rendering in 3D modeling. While adversarial attacks on object detection models are well-studied for 2D images, their impact on 3D models remains underexplored. This work introduces the Masked Iterative Fast Gradient Sign Method (M-IFGSM), designed to generate adversarial noise targeting the CLIP vision-language model. M-IFGSM specifically alters the object of interest by focusing perturbations on masked regions, degrading the performance of CLIP's zero-shot object detection capability when applied to 3D models. Using eight objects from the Common Objects 3D (CO3D) dataset, we demonstrate that our method effectively reduces the accuracy and confidence of the model, with adversarial noise being nearly imperceptible to human observers. The top-1 accuracy in original model renders drops from 95.4% to 12.5% for train images and from 91.2% to 35.4% for test images, with confidence levels reflecting this shift from true classification to misclassification, underscoring the risks of adversarial attacks on 3D models in applications such as autonomous driving, robotics, and surveillance. The significance of this research lies in its potential to expose vulnerabilities in modern 3D vision models, including radiance fields, prompting the development of more robust defenses and security measures in critical real-world applications.
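To make the masked perturbation step concrete, the following is a minimal sketch of a masked iterative FGSM update, not the authors' implementation. It rests on several assumptions: a toy linear softmax classifier (the hypothetical weight matrix `W`) stands in for CLIP ViT-B/16 so the gradient can be written analytically, a hand-built binary `mask` stands in for a SAM object mask, and the names `masked_ifgsm`, `eps`, `alpha`, and `steps` along with their default values are illustrative rather than taken from the paper.

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over a 1-D logit vector.
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def masked_ifgsm(x, mask, W, label, eps=0.03, alpha=0.005, steps=10):
    """Untargeted masked iterative FGSM on a toy linear classifier.

    x     : flattened image with values in [0, 1], shape (d,)
    mask  : binary object mask, shape (d,); 1 = pixel may be perturbed
    W     : hypothetical classifier weights, shape (num_classes, d)
    label : index of the true class
    """
    onehot = np.zeros(W.shape[0])
    onehot[label] = 1.0
    x_adv = x.copy()
    for _ in range(steps):
        probs = softmax(W @ x_adv)
        # Gradient of the cross-entropy loss with respect to the input.
        grad = W.T @ (probs - onehot)
        # Signed ascent step, applied only inside the object mask.
        x_adv = x_adv + alpha * np.sign(grad) * mask
        # Keep the total perturbation within the L-infinity budget eps
        # and the result within the valid pixel range.
        x_adv = np.clip(x_adv, x - eps, x + eps)
        x_adv = np.clip(x_adv, 0.0, 1.0)
    return x_adv

# Toy usage: perturb only the first half of a 64-pixel "image".
rng = np.random.default_rng(0)
W = rng.normal(size=(3, 64))
x = rng.uniform(0.2, 0.8, size=64)
mask = np.zeros(64)
mask[:32] = 1.0
label = int(np.argmax(W @ x))
x_adv = masked_ifgsm(x, mask, W, label)
print("true-class probability before:", softmax(W @ x)[label])
print("true-class probability after: ", softmax(W @ x_adv)[label])
```

In a setting closer to the one described above, the analytic gradient would be replaced by backpropagation through the vision-language model, and the perturbed images would then be used as input to 3D Gaussian Splatting reconstruction; both steps are outside the scope of this sketch.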
