EVA: Red-Teaming GUI Agents via Evolving Indirect Prompt Injection

Yijie Lu, Tianjie Ju, Manman Zhao, Xinbei Ma, Yuan Guo, ZhuoSheng Zhang

TL;DR

This paper tackles the vulnerability of GUI agents to indirect prompt injection by introducing EVA, a feedback-driven red-teaming framework that evolves visual injections in a black-box setting. EVA continuously analyzes the agent’s attention and task responses to adapt prompt injections (pop-ups, chat prompts, payment dialogs, and emails), achieving higher attack success rates and transferability than static baselines. Through experiments on six GUI agents and four realistic scenarios, EVA demonstrates transferable threat patterns and attention-dependent weaknesses, underscoring the need for attention-aware defenses in multimodal systems. The work also provides a reproducible evaluation pipeline and broad insights into how visual layout and concise, high-impact phrases can steer agent behavior, informing future security evaluations and defense strategies.

Abstract

As multimodal agents are increasingly trained to operate graphical user interfaces (GUIs) to complete user tasks, they face a growing threat from indirect prompt injection: attacks in which misleading instructions are embedded in the agent's visual environment, such as pop-ups or chat messages, and misinterpreted as part of the intended task. A typical example is environmental injection, in which GUI elements are manipulated to influence agent behavior without directly modifying the user prompt. To address these emerging attacks, we propose EVA, a red-teaming framework for indirect prompt injection that turns the attack into a closed-loop optimization: it continuously monitors an agent's attention distribution over the GUI and updates the adversarial cues (keywords, phrasing, and layout) in response. Compared with prior one-shot methods that generate fixed prompts without regard for how the model allocates visual attention, EVA dynamically adapts to emerging attention hotspots, yielding substantially higher attack success rates and far greater transferability across diverse GUI scenarios. We evaluate EVA on six widely used generalist and specialist GUI agents in realistic settings such as pop-up manipulation, chat-based phishing, payments, and email composition. Experimental results show that EVA substantially improves success rates over static baselines. Under goal-agnostic constraints, where the attacker does not know the agent's task intent, EVA still discovers effective patterns. Notably, we find that injection styles transfer well across models, revealing shared behavioral biases in GUI agents. These results suggest that evolving indirect prompt injection is a powerful tool not only for red-teaming agents, but also for uncovering common vulnerabilities in their multimodal decision-making.
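
The closed-loop idea described in the abstract can be illustrated with a toy simulation. The sketch below is not the authors' implementation: it replaces the attention signal with a single binary success signal from a mock agent, and every name in it (`evolve_word_weights`, `mock_agent_follows`, the vocabulary, the `boost` parameter) is a hypothetical stand-in chosen for illustration. It only shows the general pattern of sampling candidate phrases from weighted words and reinforcing the words that appear in successful candidates.

```python
import random
from typing import Callable, Dict, List


def evolve_word_weights(
    vocab: List[str],
    agent_follows: Callable[[List[str]], bool],
    rounds: int = 50,
    phrase_len: int = 2,
    boost: float = 1.0,
    seed: int = 0,
) -> Dict[str, float]:
    """Toy feedback loop: reinforce words that appear in successful phrases.

    Assumption: feedback is a single boolean per candidate phrase. The paper
    describes richer feedback (attention distribution); that is not modeled.
    """
    rng = random.Random(seed)
    weights = {word: 1.0 for word in vocab}  # start from a uniform prior
    for _ in range(rounds):
        words = list(weights)
        # Sample a candidate phrase, favoring words with higher weight.
        phrase = rng.choices(
            words, weights=[weights[w] for w in words], k=phrase_len
        )
        # Query the (mock) agent and reinforce each distinct word on success.
        if agent_follows(phrase):
            for word in set(phrase):
                weights[word] += boost
    return weights


def mock_agent_follows(phrase: List[str]) -> bool:
    """Hypothetical stand-in for an agent: reacts only to one salient word."""
    return "confirm" in phrase


if __name__ == "__main__":
    final = evolve_word_weights(
        ["confirm", "later", "details", "close"], mock_agent_follows
    )
    for word, weight in sorted(final.items(), key=lambda kv: -kv[1]):
        print(f"{word}: {weight:.1f}")
```

Because the mock agent reacts only to one word, that word is reinforced on every success and can never be overtaken by the others. A static, one-shot baseline would correspond to sampling once from the uniform prior and never updating the weights.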

Paper Structure

This paper contains 39 sections, 5 equations, 9 figures, 7 tables.

Figures (9)

  • Figure 1: Illustration of a GUI agent’s behavior under clean vs. poisoned environments. In a clean interface (left), the agent correctly follows the user’s goal (e.g., searching for "iPhone 16"). In contrast, when a misleading pop-up is injected (right), the same agent is distracted by visually salient content and clicks the injected button instead.
  • Figure 2: Comparison of static ("Ice") vs. evolving ("Fire") environmental injection strategies. Left: Static methods generate fixed poisoned environments based on pre-defined prompts, ignoring agent feedback. Right: EVA leverages an optimization loop that adjusts injection content through real-time interaction, adapting to model-specific vulnerabilities.
  • Figure 3: Overview of the EVA framework. The system evolves adversarial injections through iterative feedback from GUI agent behaviors. Word-level patterns are reinforced based on success and guide the next generation of prompts.
  • Figure 4: Comparison of attention distribution in two attack scenarios. Left: Pop-up-based injection draws sharply localized attention to the confirm button, leading to successful hijack. Right: Chat-based link injection results in broadly dispersed attention across the chat interface, reducing the chance of misdirection.
  • Figure 5: Representative indirect injection strategies visualized. All attacks are seamlessly rendered as part of the interface and captured in the screenshot consumed by the agent.
  • ...and 4 more figures