Data Poisoning in Deep Learning: A Survey

Pinlong Zhao, Weiyao Zhu, Pengfei Jiao, Di Gao, Ou Wu

TL;DR

This survey provides a focused, DL-centric examination of data poisoning attacks, distinguishing it from broader adversarial ML surveys. It develops a two-dimensional taxonomy (attack characteristics and algorithmic principles), details seven attack dimensions, and reviews data poisoning algorithms by category (heuristic, label flipping, feature space, bilevel, influence, generative, and others). It extends the discussion to large language models, covering poisoning during pre-training, fine-tuning, RLHF preference alignment, instruction tuning, PEFT-based methods, in-context learning (ICL), and prompt injection. It also discusses open challenges and future research directions and provides an online resource library. Overall, it aims to guide researchers in understanding and advancing poisoning attack methodologies in DL and LLM contexts.

Abstract

Deep learning has become a cornerstone of modern artificial intelligence, enabling transformative applications across a wide range of domains. As the core element of deep learning, the quality and security of training data critically influence model performance and reliability. However, during the training process, deep learning models face the significant threat of data poisoning, where attackers introduce maliciously manipulated training data to degrade model accuracy or induce anomalous behavior. While existing surveys provide valuable insights into data poisoning, they generally adopt a broad perspective, encompassing both attacks and defenses, and lack a dedicated, in-depth analysis of poisoning attacks specifically in deep learning. In this survey, we bridge this gap by presenting a comprehensive and targeted review of data poisoning in deep learning. First, this survey categorizes data poisoning attacks across multiple perspectives, providing an in-depth analysis of their characteristics and underlying design principles. Second, the discussion is extended to the emerging area of data poisoning in large language models (LLMs). Finally, we explore critical open challenges and propose potential research directions to advance the field. To support further exploration, an up-to-date repository of resources on data poisoning in deep learning is available at https://github.com/Pinlong-Zhao/Data-Poisoning.

Paper Structure

This paper contains 38 sections, 8 equations, 5 figures, 2 tables.

Figures (5)

  • Figure 1: Types of Adversarial Attacks.
  • Figure 2: Deep learning and data poisoning attack pipeline.
  • Figure 3: Taxonomy of data poisoning attacks in deep learning with representative examples.
  • Figure 4: The sub-taxonomy of data poisoning algorithms.
  • Figure 5: Data poisoning in LLMs.