Unique Security and Privacy Threats of Large Language Models: A Comprehensive Survey

Shang Wang, Tianqing Zhu, Bo Liu, Ming Ding, Dayong Ye, Wanlei Zhou, Philip S. Yu

TL;DR

This survey introduces a four-scenario, life-cycle taxonomy to analyze privacy and security threats unique to large language models (LLMs): pre-training, fine-tuning, deployment, and LLM-based agents. It differentiates LLM-specific risks from traditional model threats and provides per-scenario threat models, concrete examples, and a suite of countermeasures, including privacy-preserving training, backdoor defenses, prompt-design safeguards, and multi-agent governance. The work also expands the discussion to machine unlearning and watermarking as additional defensive angles, and offers a forward-looking view on robust, accountable LLM systems. Overall, the paper equips researchers and practitioners with a structured framework to assess and mitigate risks, enabling safer deployment of LLM-based technologies across domains.

Abstract

With the rapid development of artificial intelligence, large language models (LLMs) have made remarkable advancements in natural language processing. These models are trained on vast datasets to exhibit powerful language understanding and generation capabilities across various applications, including chatbots and agents. However, LLMs have revealed a variety of privacy and security issues throughout their life cycle, drawing significant academic and industrial attention. Moreover, the risks faced by LLMs differ significantly from those encountered by traditional language models. Given that current surveys lack a clear taxonomy of unique threat models across diverse scenarios, we emphasize the unique privacy and security threats associated with four specific scenarios: pre-training, fine-tuning, deployment, and LLM-based agents. Addressing the characteristics of each risk, this survey outlines and analyzes potential countermeasures. Research on attack and defense situations can offer feasible research directions, enabling more areas to benefit from LLMs.

Paper Structure (45 sections, 12 figures, 9 tables)

Figures (12)

  • Figure 1: The pipeline of our survey. For each threat scenario, the first column lists the data type used, and the second column describes the process applied. The text boxes indicate data types and processes unique to LLMs. The fourth and fifth columns detail the corresponding risks and countermeasures. Underlined text marks risks unique to LLMs.
  • Figure 2: The three threat models in pre-training LLMs, where malicious entities include data contributors, upstream and downstream developers.
  • Figure 3: The threat models in fine-tuning LLMs, where malicious entities include contributors and third parties.
  • Figure 4: The details of poisoning instruction tuning, where a malicious third party is a strong adversary.
  • Figure 5: The details of poisoning alignment tuning, where a malicious third party is a strong adversary.
  • ...and 7 more figures