Embodied Laser Attack: Leveraging Scene Priors to Achieve Agent-based Robust Non-contact Attacks

Yitong Sun; Yao Huang; Xingxing Wei

TL;DR

The paper addresses robustness gaps of physical adversarial attacks in dynamic, real-world settings, particularly for non-contact laser attacks in traffic scenarios. It proposes Embodied Laser Attack (ELA), a Perception-Decision-Control framework that uses a Perspective Transformation Network (PTN) to infer the victim view from attacker observations and a reinforcement learning agent to select laser parameters in real time. Key contributions include (1) a PTN that exploits traffic scene priors for fast, local perspective estimation, (2) an agent-based decision module trained with reinforcement learning to produce instant attack strategies, and (3) comprehensive experiments in CARLA and physically inspired scenarios showing improved attack success rates and speed over fixed or offline methods. The results highlight practical security implications for vision systems in traffic and provide a framework for evaluating robustness and concealment of non-contact adversarial attacks.

Abstract

As physical adversarial attacks are increasingly used to uncover potential risks in security-critical scenarios, especially dynamic ones, their vulnerability to environmental variations has also come to light. Because these attack methods are not robust, their performance is unstable. Although methods such as Expectation over Transformation (EOT) have improved the robustness of traditional contact attacks like adversarial patches, they fall short in practicality and concealment within dynamic environments such as traffic scenarios. Non-contact laser attacks offer greater adaptability, but the optimization space for their attributes is limited, which renders EOT less effective. This limitation underscores the need for a new strategy to improve the robustness of such attacks. To address these issues, this paper introduces the Embodied Laser Attack (ELA), a novel framework that leverages the embodied intelligence paradigm of Perception-Decision-Control to dynamically tailor non-contact laser attacks. For the perception module, given the challenge of simulating the victim's view by full-image transformation, ELA develops a local perspective transformation network that draws on the intrinsic prior knowledge of traffic scenes to enable effective and efficient estimation. For the decision and control module, ELA trains an attack agent with data-driven reinforcement learning instead of adopting time-consuming heuristic algorithms. Guided by well-designed rewards, the agent can instantaneously determine a valid attack strategy from the perceived information, which is then carried out by a controllable laser emitter. Experimentally, we apply our framework to diverse traffic scenarios in both the digital and physical worlds, verifying the effectiveness of our method under dynamic successive scenes.
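The local perspective transformation mentioned in the abstract can be illustrated with a classical planar homography. The sketch below is not the paper's PTN, which is a learned network; it only shows the geometric operation such a module would approximate: mapping the corners of a sign region from the attacker's view into the victim's view. The function names, the corner coordinates, and the use of a direct linear transform (DLT) to fit the homography are all assumptions made for illustration.

```python
import numpy as np

def apply_homography(H, points):
    """Map N x 2 pixel coordinates through a 3 x 3 homography H."""
    pts = np.asarray(points, dtype=float)
    homog = np.hstack([pts, np.ones((pts.shape[0], 1))])  # homogeneous coords
    mapped = homog @ H.T
    return mapped[:, :2] / mapped[:, 2:3]  # perspective divide

def estimate_homography(src, dst):
    """Fit a homography from >= 4 point correspondences using the DLT."""
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        # Each correspondence contributes two linear constraints on H.
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    A = np.array(rows, dtype=float)
    # The solution is the right singular vector of the smallest singular value.
    _, _, vt = np.linalg.svd(A)
    H = vt[-1].reshape(3, 3)
    return H / H[2, 2]  # normalise so that H[2, 2] == 1

# Hypothetical corner positions of one sign region (pixels), not from the paper.
attacker_corners = [(100, 100), (200, 100), (200, 200), (100, 200)]
victim_corners = [(310, 120), (380, 130), (378, 205), (312, 190)]

H = estimate_homography(attacker_corners, victim_corners)
projected = apply_homography(H, attacker_corners)
```

Restricting the transformation to a small region (four corners here) rather than warping the full image mirrors the abstract's motivation for a local estimate. A learned network would predict this mapping from the attacker's observation instead of requiring known correspondences.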

Paper Structure (19 sections, 12 equations, 6 figures, 5 tables)

This paper contains 19 sections, 12 equations, 6 figures, 5 tables.

Introduction
Related Works
Robust Physical Adversarial Attacks
Attacks in Traffic Scenarios
Methodology
Problem Definition
Perception Module
Prior Knowledge in Traffic Scenarios
Perspective Transformation with Scene-prior Knowledge
Decision and Control Module
Basic Definition
Training Stage of Policy Network
Experiments
Settings
Effectiveness of Active Perception
...and 4 more sections

Figures (6)

Figure 1: An overview of the ELA framework, which consists of two main modules for robust laser attacks. The attacker first captures the vehicle through a fixed sensor, and the perception module infers the object's state (location, scale, and distortion) in the victim's view from how the object's shape changes. The decision and control (DC) module then uses this region information to make real-time decisions autonomously and controls the light projection hardware to achieve continuous physical attacks.
Figure 2: Left: The training and inference process of PTN. By leveraging the vehicle's position in the sensor image, PTN can simulate the target's region in the victim's view by spatial transformation. Right: Derivation process executed by PTN.
Figure 3: Perception results with different target shapes. The green line in each image is the contour predicted by PTN, which closely matches the ground-truth outline.
Figure 4: Examples of ELA's attack in various scenes. Under this framework, the imaging area of the target sign is first estimated accurately, and the agent then makes effective decisions based on the perceived information at different moments.
Figure 5: Evolution of rewards and average attack steps during training on the "Stop" sign.
...and 1 more figure
