Navigating the Risks: A Survey of Security, Privacy, and Ethics Threats in LLM-Based Agents

Yuyou Gan, Yong Yang, Zhe Ma, Ping He, Rui Zeng, Yiming Wang, Qingming Li, Chunyi Zhou, Songze Li, Ting Wang, Yunjun Gao, Yingcai Wu, Shouling Ji

TL;DR

This paper introduces a novel threat taxonomy for LLM-based agents by classifying attacks according to their sources (inputs, model, or both) and their impacts (security/safety, privacy, ethics). It surveys six agent features (LLM-based controller, multimodal inputs/outputs, multi-source inputs, multi-round interaction, memory, and tool invocation) and analyzes risks across problematic inputs, model flaws, and input–model interactions. Four real-world case studies (WebGPT, Voyager, PReP, ChatDev) illustrate how these threats manifest in practice and how agent design choices shape risk exposure. The authors propose data, methodological, and policy directions to advance risk assessment, mitigation techniques, and regulatory frameworks for safer LLM-based agents. Overall, the work provides a comprehensive, cross-modal, cross-system view of emerging security, privacy, and ethics challenges in autonomous AI agents and offers concrete avenues for future research and governance.

Abstract

With the continuous development of large language models (LLMs), transformer-based models have made groundbreaking advances in numerous natural language processing (NLP) tasks, leading to the emergence of a series of agents that use LLMs as their control hub. While LLMs have achieved success in various tasks, they face numerous security and privacy threats, which become even more severe in the agent scenarios. To enhance the reliability of LLM-based applications, a range of research has emerged to assess and mitigate these risks from different perspectives. To help researchers gain a comprehensive understanding of various risks, this survey collects and analyzes the different threats faced by these agents. To address the challenges posed by previous taxonomies in handling cross-module and cross-stage threats, we propose a novel taxonomy framework based on the sources and impacts. Additionally, we identify six key features of LLM-based agents, based on which we summarize the current research progress and analyze their limitations. Subsequently, we select four representative agents as case studies to analyze the risks they may face in practical use. Finally, based on the aforementioned analyses, we propose future research directions from the perspectives of data, methodology, and policy, respectively.
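
As a rough illustration of the taxonomy described above, the sketch below encodes the three threat sources, the three impact categories, and the six agent features as enumerations, and tags a threat record with them. The class names, the `Threat` record, the `threats_by_source` helper, and the example entry are all hypothetical and are not taken from the paper; the feature and impact assignments in the example are placeholders, not findings of the survey.

```python
from dataclasses import dataclass
from enum import Enum


class Source(Enum):
    """Where a threat originates (per the survey's source axis)."""
    INPUT = "problematic inputs"
    MODEL = "model flaws"
    COMBINED = "input-model interaction"


class Impact(Enum):
    """What a threat affects (per the survey's impact axis)."""
    SECURITY_SAFETY = "security/safety"
    PRIVACY = "privacy"
    ETHICS = "ethics"


class Feature(Enum):
    """The six key features of LLM-based agents listed in the survey."""
    LLM_CONTROLLER = "LLM-based controller"
    MULTIMODAL_IO = "multi-modal inputs and outputs"
    MULTI_SOURCE_INPUTS = "multi-source inputs"
    MULTI_ROUND = "multi-round interaction"
    MEMORY = "memory mechanism"
    TOOL_INVOCATION = "tool invocation"


@dataclass(frozen=True)
class Threat:
    """Hypothetical record type: one threat tagged along both axes."""
    name: str
    source: Source
    impacts: frozenset   # set of Impact members
    features: frozenset  # set of Feature members the threat involves


def threats_by_source(threats, source):
    """Return the threats whose source matches the given Source member."""
    return [t for t in threats if t.source is source]


# Placeholder entry: the tags below are illustrative assumptions only.
example = Threat(
    name="hypothetical input-borne threat",
    source=Source.INPUT,
    impacts=frozenset({Impact.SECURITY_SAFETY}),
    features=frozenset({Feature.MULTI_SOURCE_INPUTS, Feature.TOOL_INVOCATION}),
)

print([t.name for t in threats_by_source([example], Source.INPUT)])
```

Tagging each threat with a single source and a set of impacts and features mirrors the idea that one threat can cut across several modules and stages, which is the limitation of earlier taxonomies that the abstract points to.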

Paper Structure

This paper contains 44 sections, 2 equations, 12 figures, 5 tables.

Figures (12)

  • Figure 1: The overall framework of our taxonomy for the risks of LLM-based agents.
  • Figure 2: An overall framework of LLM-based agents.
  • Figure 3: Six key features of LLM-based agents: LLM-based controller, multi-modal inputs and outputs, multi-source inputs, multi-round interaction, memory mechanism and tool invocation.
  • Figure 4: The mapping of key features to identified threats based on collected literature.
  • Figure 5: Adversarial examples targeting LLM-based agents may involve four key features (indicated with a red exclamation mark), leading to incorrect output.
  • ...and 7 more figures