AdInject: Real-World Black-Box Attacks on Web Agents via Advertising Delivery

Haowei Wang, Junjie Wang, Xiaojun Jia, Rupeng Zhang, Mingyang Li, Zhe Liu, Yang Liu, Qing Wang

TL;DR

AdInject presents a realistic black-box attack against Vision-Language Model (VLM) based Web Agents that exploits internet advertising delivery to inject malicious content into web pages. The method combines static ad content design with a VLM-driven optimization step that infers likely user intents from a target site's context, making the malicious ad appear more relevant to the agent's task. Experiments on VisualWebArena and OSWorld show high attack success rates, with $ASR$ often exceeding $60\%$ and approaching $100\%$ in favorable settings, and show that ad content optimization further boosts effectiveness. The work highlights a critical, real-world vulnerability in autonomous web agents and the urgent need for defenses against environment-manipulation attacks in deployment.

Abstract

Vision-Language Model (VLM) based Web Agents represent a significant step towards automating complex tasks by simulating human-like interaction with websites. However, their deployment in uncontrolled web environments introduces significant security vulnerabilities. Existing research on adversarial environmental injection attacks often relies on unrealistic assumptions, such as direct HTML manipulation, knowledge of user intent, or access to agent model parameters, which limits its practical applicability. In this paper, we propose AdInject, a novel, real-world black-box attack method that leverages internet advertising delivery to inject malicious content into the Web Agent's environment. AdInject operates under a significantly more realistic threat model than prior work, assuming a black-box agent, static malicious content constraints, and no specific knowledge of user intent. AdInject includes strategies for designing malicious ad content aimed at misleading agents into clicking, and a VLM-based ad content optimization technique that infers potential user intents from the target website's context and integrates these intents into the ad content to make it appear more relevant or critical to the agent's task, thus enhancing attack effectiveness. Experimental evaluations demonstrate the effectiveness of AdInject, with attack success rates exceeding 60% in most scenarios and approaching 100% in certain cases. This demonstrates that prevalent advertising delivery constitutes a potent, real-world vector for environment injection attacks against Web Agents. This work highlights a critical vulnerability in Web Agent security arising from real-world environment manipulation channels, underscoring the urgent need for robust defense mechanisms against such threats. Our code is available at https://github.com/NicerWang/AdInject.
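
To make the reported metric concrete, the following is a minimal sketch of how an attack success rate could be computed from evaluation logs. It assumes that an attack counts as successful when the agent clicks the injected ad at least once during a task episode, which is an interpretation of the abstract's "misleading agents into clicking" rather than the paper's stated definition. The `Episode` type, its `clicked_ad` field, and the `attack_success_rate` function are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Episode:
    """One agent run on one benchmark task (hypothetical record type)."""
    task_id: str
    clicked_ad: bool  # assumption: True if the agent clicked the injected ad


def attack_success_rate(episodes: Sequence[Episode]) -> float:
    """Return the fraction of episodes in which the agent clicked the ad."""
    if not episodes:
        raise ValueError("at least one episode is required")
    successes = sum(1 for e in episodes if e.clicked_ad)
    return successes / len(episodes)


# Illustrative data only, not results from the paper.
sample = [
    Episode("task-1", True),
    Episode("task-2", False),
    Episode("task-3", True),
    Episode("task-4", True),
]
print(f"ASR = {attack_success_rate(sample):.0%}")
```

Under this assumed definition, a claim such as "exceeding 60%" would mean that the agent clicked the ad in more than six out of every ten evaluated episodes.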

Paper Structure

This paper contains 34 sections, 3 equations, 3 figures, 9 tables.

Figures (3)

  • Figure 1: Demonstration of AdInject
  • Figure 2: Demonstration of Ad Content Optimization
  • Figure 3: Part of Advertisement Styles