Toward Robust and Accurate Adversarial Camouflage Generation against Vehicle Detectors

Jiawei Zhou, Linye Lyu, Daojing He, Yu Li

TL;DR

RAUCA tackles the challenge of robust physical adversarial camouflage for vehicle detectors under varying weather and viewpoints. It introduces End-to-End Neural Renderer Plus (E2E-NRP) and an Environment Feature Extractor (EFE) along with a CARLA-based multi-weather dataset to enable genuine end-to-end UV-map optimization and realistic environmental rendering. Across simulation and real-world tests on multiple detectors, RAUCA-final achieves superior attack performance and robustness, aided by a pre-trained EFE that accelerates adaptation to unseen vehicles. This work advances practical, robust physical attacks for autonomous driving safety research and provides open-source tooling for further exploration.

Abstract

Adversarial camouflage is a widely used physical attack against vehicle detectors for its superiority in multi-view attack performance. One promising approach involves using differentiable neural renderers to facilitate adversarial camouflage optimization through gradient back-propagation. However, existing methods often struggle to capture environmental characteristics during the rendering process or produce adversarial textures that can precisely map to the target vehicle. Moreover, these approaches neglect diverse weather conditions, reducing the efficacy of generated camouflage across varying weather scenarios. To tackle these challenges, we propose a robust and accurate camouflage generation method, namely RAUCA. The core of RAUCA is a novel neural rendering component, End-to-End Neural Renderer Plus (E2E-NRP), which can accurately optimize and project vehicle textures and render images with environmental characteristics such as lighting and weather. In addition, we integrate a multi-weather dataset for camouflage generation, leveraging the E2E-NRP to enhance the attack robustness. Experimental results on six popular object detectors show that RAUCA-final outperforms existing methods in both simulation and real-world settings.

Paper Structure

This paper contains 23 sections, 7 equations, 12 figures, 12 tables, 2 algorithms.

Figures (12)

  • Figure 1: Comparison of different adversarial camouflages under sunny (first row) and foggy (second row) environments, where only our method succeeds in both cases. (a) A car with normal texture. (b) DAS wang2021dual. (c) and (d) are the top-performing methods FCA wang2022fca and ACTIVE Suryanto_2023_ICCV, respectively. (e) Our method, RAUCA-final.
  • Figure 2: The overview of RAUCA. First, we create a multi-weather dataset using CARLA, which includes car images, corresponding mask images, and camera transformation sets. Then, the car images are segmented using the mask images to obtain the foreground car and background images. The foreground car image, the 3D model, and the camera transformation are passed through the E2E-NRP rendering component for rendering. The rendered image is then seamlessly integrated with the background. After a series of random output augmentations, the image is fed into the object detector. Finally, we optimize the adversarial camouflage through back-propagation with our devised loss function, which is computed from the output of the object detector.
  • Figure 3: Illustration of tensor-traversal sampling and its issue. The triangle represents a facet's area in the UV map. (a) illustrates how the tensor-traversal sampling method projects a facet from $T_{fc}$ onto the UV map. The projected positions of $T_{fc}$ are represented by green points, while black points denote the pixel points on the UV map. (b) illustrates the issue of this sampling method: the red points are optimized during camouflage generation, while the cyan points are not. The numerous unoptimized points can lead to a decline in attack effectiveness. (c) is the desired situation, in which all points on the UV map can be optimized. (d) shows the optimization result using tensor-traversal sampling, which leads to camouflage patterns resembling random noise with poor continuity. In contrast, (e) depicts the result of UV-traversal sampling, which exhibits strong texture continuity and yields more effective and robust attacks.
  • Figure 4: The comparison of adversarial camouflage generated by the neural renderer with different sampling methods. (a) Tensor-traversal sampling method: most areas appear as random noise. (b) Our proposed UV-traversal sampling method: most areas are strong adversarial texture patterns.
  • Figure 5: Comparison of the EFE training procedures in the base and final versions of our method. The blue and pink parts are enhancements of the green and yellow parts, respectively, which belong to the EFE training process of the base version. In the original training process, the number of EFE network inferences and the number of renderings are both $num(Epochs) \times num(\Phi_{M}) \times num(\Phi_{cam}) \times num(\Phi_{c})$. We first replace the online rendering process (green part) with the construction of an offline-rendered vehicle image set (blue part), reducing the number of renderings to $num(\Phi_{cam}) \times num(\Phi_{c})$. Next, we combine the output of a single EFE inference with multiple color vehicle images (pink part), instead of using one output per color vehicle image as in the base version (yellow part). This reduces the number of EFE network inferences to $num(Epochs) \times num(\Phi_{M}) \times num(\Phi_{cam})$.
  • ...and 7 more figures
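
To make the counts in the Figure 5 caption concrete, the sketch below evaluates the three expressions for one set of sizes. The function name, the parameter names, and the numbers are illustrative assumptions, not values from the paper. The parameters `n_m`, `n_cam`, and `n_c` simply stand for $num(\Phi_{M})$, $num(\Phi_{cam})$, and $num(\Phi_{c})$.

```python
def efe_training_counts(n_epochs, n_m, n_cam, n_c):
    """Evaluate the operation counts stated in the Figure 5 caption.

    n_m, n_cam, n_c stand for num(Phi_M), num(Phi_cam), num(Phi_c).
    All names and example sizes here are hypothetical.
    """
    # Base version: inferences and renderings share the same count.
    base = n_epochs * n_m * n_cam * n_c
    return {
        "base_inferences": base,
        "base_renderings": base,
        # Final version: renderings are precomputed offline once.
        "final_renderings": n_cam * n_c,
        # Final version: one EFE inference is reused across all colors.
        "final_inferences": n_epochs * n_m * n_cam,
    }


# Hypothetical sizes chosen only to show the arithmetic.
counts = efe_training_counts(n_epochs=10, n_m=4, n_cam=50, n_c=6)
for name, value in counts.items():
    print(f"{name}: {value}")
```

Under these assumed sizes, the number of renderings shrinks by a factor of `n_epochs * n_m` and the number of EFE inferences by a factor of `n_c`, which follows directly from dividing the caption's expressions.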
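
The Figure 2 caption describes segmenting the car with a mask and then integrating the rendered car with the background. A minimal sketch of one common way to do this, plain mask-based blending, is given below. The blending formula, the function name, and the toy arrays are assumptions for illustration; the caption does not specify how the integration is performed, and a real pipeline would use a differentiable tensor library so that gradients reach the texture.

```python
import numpy as np


def composite_with_background(rendered, scene, car_mask):
    """Paste a rendered car over the original scene using a binary mask.

    rendered, scene: float arrays of shape (H, W, 3) with values in [0, 1].
    car_mask: array of shape (H, W), 1 on car pixels and 0 elsewhere.
    This blending rule is an assumed stand-in, not the paper's exact method.
    """
    m = car_mask.astype(rendered.dtype)[..., None]  # (H, W) -> (H, W, 1)
    foreground = m * rendered          # keep rendered pixels on the car
    background = (1.0 - m) * scene     # keep scene pixels elsewhere
    return foreground + background


# Toy 2x2 example: car pixels on the diagonal.
scene = np.full((2, 2, 3), 0.2)
rendered = np.full((2, 2, 3), 0.9)
car_mask = np.array([[1, 0], [0, 1]])
out = composite_with_background(rendered, scene, car_mask)
```

In this toy case, pixels where the mask is 1 take the rendered value and the others keep the scene value, so the composite changes only the car region.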