Adversarial Manhole: Challenging Monocular Depth Estimation and Semantic Segmentation Models with Patch Attack
Naufal Suryanto, Andro Aprila Adiputra, Ahmada Yusril Kadiptya, Yongsu Kim, Howon Kim
TL;DR
This work exposes the vulnerability of monocular depth estimation (MDE) and semantic segmentation (SS) models to practical patch-based attacks by introducing Adversarial Manhole patches that resemble road manhole covers. Depth Planar Mapping, combined with multiple loss terms and Expectation over Transformation (EOT), enables dual-target manipulation of depth and semantics, causing a 43% relative error in MDE and a 96% attack success rate in SS under simulated and physical-like conditions. The attack remains robust across patch placements and model families, and physical simulations in CARLA show effective disruption from distances of about 2–2.5 meters. Ablation studies confirm the necessity of each component, and the results motivate the development of defenses against realistic patch-based threats in autonomous driving systems.
Abstract
Monocular depth estimation (MDE) and semantic segmentation (SS) are crucial for the navigation and environmental interpretation of many autonomous driving systems. However, their vulnerability to practical adversarial attacks is a significant concern. This paper presents a novel adversarial attack using practical patches that mimic manhole covers to deceive MDE and SS models. The goal is to cause these systems to misinterpret scenes, leading to false detections of near obstacles or non-passable objects. We use Depth Planar Mapping to precisely position these patches on road surfaces, enhancing the attack's effectiveness. Our experiments show that these adversarial patches cause a 43% relative error in MDE and achieve a 96% attack success rate in SS. These patches create affected error regions over twice their size in MDE and approximately equal to their size in SS. Our studies also confirm the patch's effectiveness in physical simulations, the adaptability of the patches across different target models, and the effectiveness of our proposed modules, highlighting their practical implications.
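To make the three reported quantities concrete, the sketch below shows one plausible way to compute a relative depth error, an attack success rate, and an affected-region ratio from per-pixel model outputs. The abstract does not define these metrics, so the definitions here are assumptions: the function names, the 10% threshold, the target class id, and the toy 4x4 arrays are all hypothetical and illustrative only.

```python
import numpy as np


def mean_relative_error(depth_clean, depth_adv, mask):
    """Mean of |d_adv - d_clean| / d_clean over the masked pixels (assumed definition)."""
    clean = depth_clean[mask]
    adv = depth_adv[mask]
    return float(np.mean(np.abs(adv - clean) / clean))


def attack_success_rate(labels_adv, target_class, mask):
    """Fraction of masked pixels predicted as the attacker's target class (assumed definition)."""
    return float(np.mean(labels_adv[mask] == target_class))


def affected_region_ratio(depth_clean, depth_adv, mask, threshold=0.1):
    """Area with relative depth change above `threshold`, divided by the patch area.

    The 10% default threshold is a hypothetical choice, not taken from the paper.
    """
    rel = np.abs(depth_adv - depth_clean) / depth_clean
    return float(np.sum(rel > threshold) / np.sum(mask))


# Toy 4x4 scene: flat road at 10 m, patch covering the central 2x2 pixels.
depth_clean = np.full((4, 4), 10.0)
patch_mask = np.zeros((4, 4), dtype=bool)
patch_mask[1:3, 1:3] = True

# The adversarial depth map is pulled to 5 m over 8 pixels, twice the patch area.
depth_adv = depth_clean.copy()
depth_adv[1:3, :] = 5.0

# Three of the four patch pixels flip to a hypothetical target class id 7.
labels_adv = np.zeros((4, 4), dtype=int)
labels_adv[1:3, 1:3] = [[7, 7], [7, 0]]

mre = mean_relative_error(depth_clean, depth_adv, patch_mask)
asr = attack_success_rate(labels_adv, 7, patch_mask)
ratio = affected_region_ratio(depth_clean, depth_adv, patch_mask)
print(f"relative error={mre:.2f}, success rate={asr:.2f}, affected/patch area={ratio:.1f}")
```

In this toy case the affected depth region is twice the patch area, which mirrors the kind of statement the abstract makes about MDE; the real evaluation protocol may use a different mask, threshold, or averaging scheme.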
