Fractured Glass, Failing Cameras: Simulating Physics-Based Adversarial Samples for Autonomous Driving Systems
Manav Prabhakar, Jwalandhar Girnar, Arpan Kusari
TL;DR
This work identifies fractured camera enclosures as a realistic, physics-based source of adversarial samples for autonomous driving perception. It introduces a scalable pipeline that couples a 2D finite-element fracture model with minimum spanning-tree crack propagation and a physically-based rendering (PBR) stage to generate realistic cracked-glass overlays on datasets such as KITTI, BDD100K, and MS-COCO. By evaluating with detectors like YOLOv8, Faster R-CNN, and Pyramid Vision Transformer (PVTv2), the study demonstrates substantial detection perturbations and uses KL divergence to show the generated samples closely approximate real cracked-glass patterns, while ablations confirm robustness and scalability (~1.6s per sample). The work highlights the need for defense mechanisms against physically plausible camera failures and proposes a transferable, black-box approach for producing adversarial data to advance robust perception research in autonomous driving.
Abstract
While much recent research has focused on generating physics-based adversarial samples, a critical yet often overlooked category originates from physical failures within on-board cameras, components essential to the perception systems of autonomous vehicles. Camera failures, whether due to external stresses causing hardware breakdown or internal component faults, can directly jeopardize the safety and reliability of autonomous driving systems. First, we motivate the study with two separate real-world experiments showing that glass failures do cause detection-based neural network models to fail. Second, we develop a simulation-based study that uses the physical process of glass breakage to create perturbed scenarios, representing a realistic class of physics-based adversarial samples. Using a finite element model (FEM)-based approach, we generate surface cracks on the camera image by applying a stress field defined by particles within a triangular mesh. Finally, we use physically-based rendering (PBR) techniques to provide realistic visualizations of these physically plausible fractures. To assess the safety implications, we apply the simulated broken glass effects as image filters to two autonomous driving datasets, KITTI and BDD100K, as well as the large-scale image detection dataset MS-COCO. We then evaluate detection failure rates for critical object classes using CNN-based object detection models (YOLOv8 and Faster R-CNN) and a transformer-based architecture with Pyramid Vision Transformers. To further investigate the distributional impact of these visual distortions, we compute the Kullback-Leibler (K-L) divergence between three distinct data distributions, applying various broken glass filters to a custom dataset (captured through a cracked windshield), as well as the KITTI and Kaggle cats and dogs datasets.
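As a rough illustration of the K-L divergence comparison mentioned above, the following minimal sketch computes the divergence between two discrete distributions built from pixel-intensity histograms. The choice of intensity histograms as the compared distributions, the bin count, the smoothing constant `eps`, and the helper names `histogram` and `kl_divergence` are all assumptions made for illustration; the abstract does not specify which features or estimator the authors use.

```python
import math

def histogram(pixels, bins=16, max_value=256):
    """Count pixel intensities (0 <= v < max_value) into equal-width bins."""
    counts = [0] * bins
    width = max_value / bins
    for v in pixels:
        idx = min(int(v / width), bins - 1)
        counts[idx] += 1
    return counts

def kl_divergence(p_counts, q_counts, eps=1e-10):
    """D_KL(P || Q) in nats for two histograms over the same bins.

    eps is a small smoothing term (an assumption of this sketch) that
    avoids log(0) and division by zero for empty bins.
    """
    p = [c + eps for c in p_counts]
    q = [c + eps for c in q_counts]
    p_total, q_total = sum(p), sum(q)
    return sum(
        (pi / p_total) * math.log((pi / p_total) / (qi / q_total))
        for pi, qi in zip(p, q)
    )

# Toy data: a "clean" image and a hypothetical "cracked" version in which
# some pixels are brightened, loosely mimicking light scattered by cracks.
clean_pixels = [10, 40, 80, 120, 160, 200, 90, 60]
cracked_pixels = [10, 40, 250, 120, 255, 200, 245, 60]

clean_hist = histogram(clean_pixels)
cracked_hist = histogram(cracked_pixels)

print(kl_divergence(clean_hist, cracked_hist))
```

Note that K-L divergence is asymmetric, so `kl_divergence(a, b)` and `kl_divergence(b, a)` generally differ; which direction is reported is a choice that should be stated alongside any result.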
