AD$^2$: Analysis and Detection of Adversarial Threats in Visual Perception for End-to-End Autonomous Driving Systems
Ishan Sahu, Somnath Hazra, Somak Aditya, Soumyajit Dey
TL;DR
This work addresses the robustness of end-to-end autonomous driving systems under adversarial perturbations in the visual perception pipeline. It performs closed-loop CARLA experiments with three black-box attack vectors, Poltergeist (motion blur from acoustic interference), SNAL (ghost object injection), and ESIA (electromagnetic interference), on state-of-the-art agents and demonstrates driving-score degradations of up to 99%. To mitigate such threats, it introduces AD$^2$, a lightweight, real-time detector that leverages spatial-temporal attention over multi-camera inputs to identify adversarial frames without requiring access to internal agent latents, achieving superior detection performance (higher AUC/TPR, lower FPR) and efficiency (≈1.6× faster inference and ≈20× fewer parameters vs. a strong baseline). The results underscore persistent vulnerabilities in end-to-end AD systems under visual attacks while showing that external detectors like AD$^2$ can enable safer operation, for example by triggering alarms or safe-mode controllers; future work should address adaptive adversaries and real-world deployment considerations. Key metrics include Driving Score $DS = R \times P$, Route Completion $R$, Infraction Penalty $P$, and Lane Deviation $L_{\text{dev}}$, with $R$ and $P$ formalized in the paper and $DS$ reflecting overall safety performance.
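As a rough illustration of how the Driving Score combines its two factors, and of what a relative degradation such as "up to 99%" means, the following minimal sketch computes $DS = R \times P$ and the percentage drop between a clean and an attacked run. It assumes the CARLA Leaderboard convention of $R$ as a percentage in $[0, 100]$ and $P$ as a multiplicative factor in $[0, 1]$; the paper's own formalization may differ. The function names and the numbers in the example are hypothetical and are not results from the paper.

```python
def driving_score(route_completion: float, infraction_penalty: float) -> float:
    """Return DS = R * P.

    Assumption: R is a percentage in [0, 100] and P is a factor in [0, 1],
    following the CARLA Leaderboard convention.
    """
    if not 0.0 <= route_completion <= 100.0:
        raise ValueError("route_completion must lie in [0, 100]")
    if not 0.0 <= infraction_penalty <= 1.0:
        raise ValueError("infraction_penalty must lie in [0, 1]")
    return route_completion * infraction_penalty


def relative_drop(clean_ds: float, attacked_ds: float) -> float:
    """Return the percentage drop in driving score relative to the clean run."""
    if clean_ds <= 0.0:
        raise ValueError("clean_ds must be positive")
    return 100.0 * (clean_ds - attacked_ds) / clean_ds


if __name__ == "__main__":
    # Hypothetical values chosen only to illustrate the arithmetic.
    clean = driving_score(route_completion=90.0, infraction_penalty=0.8)
    attacked = driving_score(route_completion=12.0, infraction_penalty=0.5)
    print(f"clean DS = {clean:.1f}, attacked DS = {attacked:.1f}")
    print(f"relative drop = {relative_drop(clean, attacked):.1f}%")
```

Because $DS$ is a product, an attack can lower it either by cutting the route short (smaller $R$) or by causing infractions (smaller $P$), so a large relative drop does not by itself say which factor was affected.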
Abstract
End-to-end autonomous driving systems have achieved significant progress, yet their adversarial robustness remains largely underexplored. In this work, we conduct a closed-loop evaluation of state-of-the-art autonomous driving agents under black-box adversarial threat models in CARLA. Specifically, we consider three representative attack vectors on the visual perception pipeline: (i) a physics-based blur attack induced by acoustic waves, (ii) an electromagnetic interference attack that distorts captured images, and (iii) a digital attack that adds ghost objects as carefully crafted bounded perturbations on images. Our experiments on two advanced agents, Transfuser and Interfuser, reveal severe vulnerabilities to such attacks, with driving scores dropping by up to 99% in the worst case, raising valid safety concerns. To help mitigate such threats, we further propose a lightweight Attack Detection model for Autonomous Driving systems (AD$^2$) based on attention mechanisms that capture spatial-temporal consistency. Comprehensive experiments across multi-camera inputs on CARLA show that our detector achieves superior detection capability and computational efficiency compared to existing approaches.
