AI Agents Under Threat: A Survey of Key Security Challenges and Future Pathways

Zehang Deng, Yongjian Guo, Changzhou Han, Wanlun Ma, Junwu Xiong, Sheng Wen, Yang Xiang

TL;DR

The paper addresses the security challenges of AI agents by organizing threats around four knowledge gaps: unpredictability of multi-step user inputs, complexity of internal executions, variability of operational environments, and interactions with untrusted external entities. It offers a comprehensive taxonomy of intra-execution and interaction threats, covering perception, brain, action, environment, agent interactions, and memory, with concrete defense concepts such as sandboxing, guardrails, memory management, and auditing. By surveying 100+ papers from AI and cybersecurity venues, it outlines practical mitigation strategies and identifies key open problems, including robust input inspection, memory-safe architectures, and principled evaluation baselines. The work aims to guide researchers and practitioners toward more trustworthy, robust AI-agent deployments across domains, including critical applications where security and safety are paramount.

Abstract

An Artificial Intelligence (AI) agent is a software entity that autonomously performs tasks or makes decisions based on pre-defined objectives and data inputs. AI agents, capable of perceiving user inputs, reasoning and planning tasks, and executing actions, have seen remarkable advancements in algorithm development and task performance. However, the security challenges they pose remain under-explored and unresolved. This survey delves into the emerging security threats faced by AI agents, categorizing them into four critical knowledge gaps: unpredictability of multi-step user inputs, complexity in internal executions, variability of operational environments, and interactions with untrusted external entities. By systematically reviewing these threats, this paper highlights both the progress made and the existing limitations in safeguarding AI agents. The insights provided aim to inspire further research into addressing the security threats associated with AI agents, thereby fostering the development of more robust and secure AI agent applications.

Paper Structure (31 sections, 4 figures, 1 table)

Figures (4)

  • Figure 1: Illustration of knowledge gaps in AI agent security. These knowledge gaps increase the security challenges of AI agents. Specifically, Gap 1 is associated with Threats on Perception; Gap 2 is linked with Threats on Brain and Threats on Action; Gap 3 is related to Threats on Agent2Environment; and Gap 4 concerns Threats on Agent2Agent and Threats on Memory.
  • Figure 2: General workflow of an AI agent. Typically, an AI agent consists of three components: perception, brain, and action.
  • Figure 3: Taxonomy of the literature on AI agent security.
  • Figure 4: Illustration of future directions.