Exploring the Adversarial Vulnerabilities of Vision-Language-Action Models in Robotics

Taowen Wang, Cheng Han, James Chenhao Liang, Wenhao Yang, Dongfang Liu, Luna Xinyu Zhang, Qifan Wang, Jiebo Luo, Ruixiang Tang

TL;DR

This paper examines the adversarial vulnerabilities of Vision-Language-Action (VLA) models used in robotics, arguing that the end-to-end, cross-modal nature of VLA systems creates new attack surfaces tied to physical dynamics and temporal action sequences. It introduces three attack objectives, the Untargeted Action Discrepancy Attack (UADA), the Untargeted Position-aware Attack (UPA), and the Targeted Manipulation Attack (TMA), together with a patch-based attack that is effective in both digital and physical environments. It also proposes a Normalized Action Discrepancy (NAD) metric to quantify fine-grained action deviations. Experiments on OpenVLA/LIBERO across simulated and real-world tasks show substantial degradation in task success, with an average failure rate of up to 100% in simulation and notable transfer to real-world trials. The work highlights critical security gaps in current VLA architectures and calls for defense strategies and broader robustness testing prior to real-world deployment.

Abstract

Recently in robotics, Vision-Language-Action (VLA) models have emerged as a transformative approach, enabling robots to execute complex tasks by integrating visual and linguistic inputs within an end-to-end learning framework. Despite their significant capabilities, VLA models introduce new attack surfaces. This paper systematically evaluates their robustness. Recognizing the unique demands of robotic execution, our attack objectives target the inherent spatial and functional characteristics of robotic systems. In particular, we introduce two untargeted attack objectives that leverage spatial foundations to destabilize robotic actions, and a targeted attack objective that manipulates the robotic trajectory. Additionally, we design an adversarial patch generation approach that places a small, colorful patch within the camera's view, effectively executing the attack in both digital and physical environments. Our evaluation reveals a marked degradation in task success rates, with up to a 100% reduction across a suite of simulated robotic tasks, highlighting critical security gaps in current VLA architectures. By unveiling these vulnerabilities and proposing actionable evaluation metrics, we advance both the understanding and enhancement of safety for VLA-based robotic systems, underscoring the necessity for continuously developing robust defense strategies prior to physical-world deployments.

Paper Structure

This paper contains 18 sections, 10 equations, 6 figures, 2 tables, 1 algorithm.

Figures (6)

  • Figure 1: Adversarial Vulnerabilities induced by malicious manipulation. (A). Illustration of adversarial threats in robotic task execution. (B). Example of semantic-rich adversarial patches generated by proposed methods. (C). Comparison of failure rates across different attack schemes (UADA, UPA, and TMA).
  • Figure 2: Overall Adversarial Framework. The robot captures an input image, processes it through a vision-language model to generate tokens representing actions, and then uses an action de-tokenizer for discrete bin prediction. The model is optimized with adversarial objectives focusing on various discrepancies and geometries (i.e., UADA, UPA, TMA). Forward propagation is shown in black, and backpropagation is highlighted in pink. These objectives aim to maximize errors and minimize task performance, with visual emphasis on 3D-space manipulation and a focus on generating adversarial perturbation $\delta$ during task execution, such as picking up a can.
  • Figure 3: Qualitative Results of adversarial vulnerabilities over OpenVLA-7B [kim24openvla] and OpenVLA-7B-LIBERO [kim24openvla] with the objectives UADA, UPA, and TMA, respectively. We visualize the overall 3D trajectories and 2D trajectories of benign $\bullet$ and adversarial $\bullet$ scenarios at each time step to compare how the generated adversarial patch affects them. The untargeted trajectory $\bullet$ is visualized in the UADA task. All trajectories start at ▲, and the successful end point is marked with ★.
  • Figure 4: Qualitative Results in the physical world. The first and second rows show benign and adversarial cases, respectively.
  • Figure 5: Impact of Inner-loop, Patch Size and Defense Discussion. The figure shows how varying the inner-loop setting affects NAD in UADA, and how patch size affects the $L_1$ distance and the failure rate in TMA, both targeting DoF$_1$. (a) Impact of Inner-loop, (b) Impact of Patch Size, and (c-f) the effect of four different defenses on failure rates.
  • ...and 1 more figures
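
The Figure 2 caption describes two mechanical steps that are easy to picture in code: a patch is placed inside the camera image, and the model's discrete action bins are mapped back to continuous values by a de-tokenizer. The following is a minimal sketch of those two steps only, not the authors' implementation. The function names `apply_patch` and `detokenize`, the grayscale list-of-lists image, the uniform binning, the 256-bin default, and the [-1, 1] action range are all assumptions made for illustration; the patch optimization itself (UADA, UPA, TMA) is not shown.

```python
from typing import List

# Hypothetical image type: a grayscale image as rows of intensities in [0, 1].
Image = List[List[float]]


def apply_patch(image: Image, patch: Image, top: int, left: int) -> Image:
    """Return a copy of `image` with `patch` pasted at row `top`, column `left`."""
    height, width = len(image), len(image[0])
    p_height, p_width = len(patch), len(patch[0])
    if top < 0 or left < 0 or top + p_height > height or left + p_width > width:
        raise ValueError("patch must lie fully inside the image")
    patched = [row[:] for row in image]  # copy so the input image is untouched
    for r in range(p_height):
        patched[top + r][left:left + p_width] = list(patch[r])
    return patched


def detokenize(bin_index: int, low: float = -1.0, high: float = 1.0,
               n_bins: int = 256) -> float:
    """Map a discrete bin index to the centre of its bin in [low, high].

    Uniform bins are an assumption of this sketch, not a claim about the paper.
    """
    if not 0 <= bin_index < n_bins:
        raise ValueError("bin_index out of range")
    bin_width = (high - low) / n_bins
    return low + (bin_index + 0.5) * bin_width


if __name__ == "__main__":
    frame = [[0.0] * 4 for _ in range(4)]   # toy 4x4 camera frame
    patch = [[1.0, 1.0], [1.0, 1.0]]        # toy 2x2 patch
    attacked = apply_patch(frame, patch, top=1, left=1)
    for row in attacked:
        print(row)
    print(detokenize(0, n_bins=4), detokenize(3, n_bins=4))
```

In an actual attack the patch pixels would be the optimization variable, and the adversarial objective would be computed on the model's predicted action bins; here both functions only illustrate the data flow around the model.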