Atomicity for Agents: Exposing, Exploiting, and Mitigating TOCTOU Vulnerabilities in Browser-Use Agents

Linxi Jiang; Zhijie Liu; Haotian Luo; Zhiqiang Lin

TL;DR

A lightweight mitigation based on pre-execution validation, which monitors DOM and layout changes during planning and validates the page state immediately before action execution, reduces the risk of insecure execution and mitigates unintended side effects in browser-use agents.

Abstract

Browser-use agents are widely used for everyday tasks. They enable automated interaction with web pages through structured DOM-based interfaces or vision-language models operating on page screenshots. However, web pages often change between planning and execution, causing agents to execute actions based on stale assumptions. We view this temporal mismatch as a time-of-check to time-of-use (TOCTOU) vulnerability in browser-use agents. Dynamic or adversarial web content can exploit this window to induce unintended actions. We present a large-scale empirical study of TOCTOU vulnerabilities in browser-use agents using a benchmark that spans synthesized and real-world websites. Using this benchmark, we evaluate 10 popular open-source agents and show that TOCTOU vulnerabilities are widespread. We design a lightweight mitigation based on pre-execution validation. It monitors DOM and layout changes during planning and validates the page state immediately before action execution. This approach reduces the risk of insecure execution and mitigates unintended side effects in browser-use agents.
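The pre-execution validation idea in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the page model (a dictionary of element boxes and text), the `fingerprint` helper, and the `guarded_execute` loop with its `read_page`, `plan`, and `act` callbacks are all hypothetical names chosen for illustration. The sketch assumes that a hash of the DOM and layout state is a sufficient stand-in for change monitoring.

```python
import hashlib
import json
from typing import Any, Callable, Dict

# Hypothetical page model: element id -> {"box": [x, y, w, h], "text": str}.
Page = Dict[str, Dict[str, Any]]


def fingerprint(page: Page) -> str:
    """Hash a canonical serialization of the DOM and layout state."""
    canonical = json.dumps(page, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def guarded_execute(
    read_page: Callable[[], Page],
    plan: Callable[[Page], Any],
    act: Callable[[Any], None],
    max_retries: int = 2,
) -> bool:
    """Run plan-then-act, but act only if the page is unchanged since the check.

    Returns True if an action was executed, False if the page never stabilized.
    """
    for _ in range(max_retries + 1):
        observed = read_page()            # time of check
        check = fingerprint(observed)
        action = plan(observed)           # slow step: the page may change here
        if fingerprint(read_page()) == check:
            act(action)                   # time of use, on a validated state
            return True
        # The page changed during planning: discard the stale plan and retry.
    return False
```

Re-validating immediately before `act` narrows the TOCTOU window to the gap between the final fingerprint comparison and the action itself; it does not remove that gap entirely.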

Paper Structure (54 sections, 3 equations, 8 figures, 3 tables, 1 algorithm)

Introduction
Contributions.
Background
Browser-Use and Web Agents
Observation spaces.
Action spaces.
Interaction loop.
TOCTOU in Operating Systems
The TOCTOU race.
Mitigation patterns.
Exposing TOCTOU Vulnerabilities
Vulnerability Definition
Agent loop and notation.
TOCTOU window.
Vulnerability condition.
...and 39 more sections

Figures (8)

Figure 1: A real-world TOCTOU example on the Forbes homepage. The green region indicates the intended target area at $t_1$. A delayed advertisement overlay (red region) appears at $t_2$ and overlaps the target, so a subsequent click at $t_3$ can become an unintended ad click that redirects to an advertisement page.
Figure 2: A TOCTOU window in the browser-use agent loop. The agent selects $a_{\text{plan}}$ from $o_{t_{\text{plan}}}$, but the page changes before $t_{\text{act}}$, so $a_{\text{plan}}$ may apply to a different target.
Figure 3: A DynWeb instance for Type I (UI changes). An adversary-controlled origin injects a delayed overlay between check time ($t_1$) and use time ($t_3$), causing the agent's click to resolve to an unintended control and redirecting it into an adversary-chosen flow.
Figure 4: Mitigation Framework. The agent plans actions while monitoring DOM and layout changes, and execution proceeds only if validation confirms stability.
Figure 5: Trigger ratio of TOCTOU vulnerabilities across three manipulation types. Here, n counts the number of cases per type, including both synthesized scenarios and real-world websites.
...and 3 more figures
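The overlay scenario in Figures 1 and 3 can be sketched as a toy hit test. The `Element` class, the `hit_test` function, and all coordinates below are hypothetical and do not come from the paper's benchmark. The sketch assumes a click at fixed coordinates resolves to the topmost element covering that point.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Element:
    name: str
    x: int
    y: int
    width: int
    height: int
    z: int  # stacking order; a higher value is rendered on top

    def contains(self, px: int, py: int) -> bool:
        return (self.x <= px < self.x + self.width
                and self.y <= py < self.y + self.height)


def hit_test(elements: List[Element], px: int, py: int) -> Optional[str]:
    """Return the name of the topmost element under the point, if any."""
    hits = [e for e in elements if e.contains(px, py)]
    return max(hits, key=lambda e: e.z).name if hits else None


# t1 (check): the agent observes the page and picks click coordinates.
page = [Element("headline_link", 100, 200, 300, 40, z=0)]
click = (150, 220)
target_at_t1 = hit_test(page, *click)

# t2: a delayed overlay is injected on top of the intended target.
page.append(Element("ad_overlay", 50, 150, 500, 200, z=10))

# t3 (use): the same coordinates now resolve to a different element.
target_at_t3 = hit_test(page, *click)
```

In this toy model the planned coordinates stay the same while the element they resolve to changes, which is the mismatch a pre-execution check is meant to catch.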
