Agentic AI Security: Threats, Defenses, Evaluation, and Open Challenges
Shrestha Datta, Shahriar Kabir Nahin, Anshuman Chhabra, Prasant Mohapatra
TL;DR
Agentic AI, built from LLMs with autonomy, memory, and tool use, introduces security risks that go beyond those addressed by traditional AI safety and software security. The paper surveys this landscape by proposing a structured threat taxonomy, reviewing defenses from design to governance, and examining benchmarks for safety-critical agents, while outlining open challenges such as long-horizon security and adaptive attacks. It contributes a holistic view that links attack surfaces (prompt injection, autonomous tool abuse, multi-agent protocols) with defense strategies (prompt-resistant designs, policy enforcement, sandboxing, monitoring, and standards) and advocates for process-aware benchmarking and robust evaluation. The findings underscore the urgency of secure-by-design agentic AI and provide a roadmap for researchers and practitioners to develop resilient, auditable, and trustworthy autonomous systems.
Abstract
Agentic AI systems, powered by large language models (LLMs) and endowed with planning, tool use, memory, and autonomy, are emerging as powerful, flexible platforms for automation. Their ability to autonomously execute tasks across web, software, and physical environments creates new and amplified security risks, distinct from both traditional AI safety and conventional software security. This survey outlines a taxonomy of threats specific to agentic AI, reviews recent benchmarks and evaluation methodologies, and discusses defense strategies from both technical and governance perspectives. We synthesize current research and highlight open challenges, aiming to support the development of secure-by-design agent systems.
