The Cognitive Firewall: Securing Browser Based AI Agents Against Indirect Prompt Injection Via Hybrid Edge Cloud Defense

Qianlong Lan, Anuj Kaul

Abstract

Deploying large language models (LLMs) as autonomous browser agents exposes a significant attack surface in the form of Indirect Prompt Injection (IPI). Cloud-based defenses can provide strong semantic analysis, but they introduce latency and raise privacy concerns. We present the Cognitive Firewall, a three-stage split-compute architecture that distributes security checks across the client and the cloud. The system consists of a local visual Sentinel, a cloud-based Deep Planner, and a deterministic Guard that enforces execution-time policies. Across 1,000 adversarial samples, edge-only defenses fail to detect 86.9% of semantic attacks. In contrast, the full hybrid architecture reduces the overall attack success rate (ASR) to below 1% (0.88% under static evaluation and 0.67% under adaptive evaluation), while maintaining deterministic constraints on side-effecting actions. By filtering presentation-layer attacks locally, the system avoids unnecessary cloud inference and achieves an approximately 17,000x latency advantage over cloud-only baselines. These results indicate that deterministic enforcement at the execution boundary can complement probabilistic language models, and that split-compute provides a practical foundation for securing interactive LLM agents.

Paper Structure

This paper contains 34 sections, 1 equation, 3 figures, 3 tables, 1 algorithm.

Figures (3)

  • Figure 1: Operational sequence of the Cognitive Firewall. User inputs are first processed by the Edge Sentinel (Layer 1) to filter visual obfuscations. Only sanitized, text-based context is sent to the Cloud Planner (Layer 2) for semantic reasoning. The resulting plan is then validated by the Edge Guard (Layer 3) against a local whitelist and intent constraints before execution, yielding a fail-closed, defense-in-depth workflow.
  • Figure 2: The defense funnel (Sankey diagram). Visualization of attack filtration across $N=1000$ samples. Presentation-layer attacks are filtered by the Edge Sentinel, semantic threats are handled by the Cloud Planner, and remaining hijacking attempts are blocked by the Edge Guard. Only 0.88% of attacks bypass all layers.
  • Figure 3: System latency distribution by defense layer (log scale). The Edge Sentinel (Layer 1) runs at $\mu \approx 0.06$ ms, roughly 17,000x faster than the Cloud Planner ($\mu \approx 288$ ms).
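
The three-layer sequence in the Figure 1 caption can be sketched as a small fail-closed pipeline. This is an illustrative sketch only, not the authors' implementation: the names `Action`, `edge_sentinel`, `edge_guard`, `run_pipeline`, `ALLOWED_HOSTS`, and `ALLOWED_KINDS` are hypothetical, the zero-width-character check is a toy stand-in for the visual Sentinel, and the Cloud Planner is represented by a caller-supplied function rather than a real model call.

```python
from dataclasses import dataclass
from typing import Callable, Optional
from urllib.parse import urlparse


@dataclass(frozen=True)
class Action:
    """Hypothetical plan step returned by the planner (not the paper's schema)."""
    kind: str  # e.g. "navigate", "click", "submit"
    url: str


# Assumed local policy: which hosts and action kinds the Guard permits.
ALLOWED_HOSTS = frozenset({"example.com"})
ALLOWED_KINDS = frozenset({"navigate", "click"})

# Toy proxy for presentation-layer obfuscation: invisible Unicode characters.
ZERO_WIDTH = frozenset({"\u200b", "\u200c", "\u200d", "\u2060", "\ufeff"})


def edge_sentinel(page_text: str) -> Optional[str]:
    """Layer 1 stand-in: return the text if it looks clean, else None."""
    if any(ch in ZERO_WIDTH for ch in page_text):
        return None
    return page_text


def edge_guard(action: Action) -> bool:
    """Layer 3 stand-in: deterministic allowlist check on the planned action."""
    host = urlparse(action.url).hostname
    return action.kind in ALLOWED_KINDS and host in ALLOWED_HOSTS


def run_pipeline(
    page_text: str,
    planner: Callable[[str], Optional[Action]],
) -> Optional[Action]:
    """Return an approved action, or None if any layer blocks (fail-closed)."""
    clean = edge_sentinel(page_text)
    if clean is None:
        return None  # blocked locally; the planner is never called
    try:
        action = planner(clean)  # Layer 2: semantic reasoning (remote in the paper)
    except Exception:
        return None  # planner failure counts as a block, not a pass
    if action is None or not edge_guard(action):
        return None
    return action
```

In this sketch every failure path returns `None`, so an action is executed only when all three layers approve it. Input rejected by the Sentinel never reaches the planner, which mirrors the caption's point that only sanitized context is sent to the cloud.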