A Survey of Privacy Attacks in Machine Learning

Maria Rigaki, Sebastian Garcia

TL;DR

This survey addresses privacy and confidentiality challenges in machine learning by proposing a unifying threat model and a taxonomy that categorizes attacks into membership inference, reconstruction, property inference, and model extraction. It analyzes over 40 papers from 2014–2020, detailing attack designs, common implementation patterns (notably shadow training in centralized settings and gradient or parameter exposure in distributed settings), and defenses such as differential privacy and query-based detection. Key contributions include mapping attacker knowledge to attack outcomes, outlining causes of leaks (overfitting, model complexity, and data properties), and surveying defenses across attack types with a focus on the trade-off between practical utility and privacy guarantees. The survey highlights open problems, including broader coverage beyond deep learning, theoretical understanding of leakage mechanisms, realistic deployment testing, and the interplay with security, fairness, and explainability.

Abstract

As machine learning becomes more widely used, the need to study its implications in security and privacy becomes more urgent. Although the body of work in privacy has been steadily growing over the past few years, research on the privacy aspects of machine learning has received less focus than the security aspects. Our contribution in this research is an analysis of more than 40 papers related to privacy attacks against machine learning that have been published during the past seven years. We propose an attack taxonomy, together with a threat model that allows the categorization of different attacks based on the adversarial knowledge, and the assets under attack. An initial exploration of the causes of privacy leaks is presented, as well as a detailed analysis of the different attacks. Finally, we present an overview of the most commonly proposed defenses and a discussion of the open problems and future directions identified during our analysis.

Paper Structure

This paper contains 51 sections, 9 equations, 6 figures, 2 tables.

Figures (6)

  • Figure 1: Threat Model of privacy and confidentiality attacks against machine learning systems. The human figure represents actors and the symbols represent the assets. Dashed lines represent data and information flow, while full lines represent possible actions. In red are the actions of the adversaries, available under the threat model.
  • Figure 2: Threat model in a collaborative learning setting. Dashed lines represent data and information flows, while full lines represent possible actions. In red are the actions of the adversaries, available under the threat model. In this setting the adversary can be placed either at the parameter server or locally. Model consumers are not depicted for reasons of simplicity. In a federated learning setting, local model owners are also model consumers.
  • Figure 3: Shadow training architecture. In the first stage, a number of shadow models are trained on their respective shadow datasets in order to emulate the behavior of the target model. In the second stage, a meta-model is trained on the outputs of the shadow models and the known labels of the shadow datasets. The meta-model is then used to infer membership or properties of the data or the model, given the output of the target model.
  • Figure 4: Map of attack types per algorithm. The list of algorithms presented is not exhaustive but indicative. Underneath each algorithm or area of machine learning there is an indication of the attacks that have been studied so far. A red box indicates no attack.
  • Figure 5: Number of papers used against each learning task and attack type. Classification includes both binary and multi-class classification. Darker gray means a higher number of papers.
  • ...and 1 more figures
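
The two-stage shadow training procedure described in the Figure 3 caption can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the survey: the synthetic Gaussian data, the deliberately overfit kernel-smoother "model", the function names, and the use of a single confidence threshold as the meta-model. Published attacks typically train a neural network or other classifier as the meta-model.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_data(n):
    """Draw n labeled points from two overlapping 2-D Gaussians (hypothetical data)."""
    y = rng.integers(0, 2, size=n)
    x = rng.normal(loc=y[:, None] * 1.0, scale=1.0, size=(n, 2))
    return x, y

def train(x, y):
    """'Train' a deliberately overfit model: it simply memorizes its training set."""
    return (x, y)

def predict_proba(model, x, bandwidth=0.1):
    """Return P(class = 1) for each row of x via narrow-kernel weighted voting."""
    xt, yt = model
    d2 = ((x[:, None, :] - xt[None, :, :]) ** 2).sum(-1)
    w = np.exp(-d2 / (2 * bandwidth ** 2)) + 1e-12  # small constant avoids 0/0
    return (w * yt[None, :]).sum(1) / w.sum(1)

def true_class_confidence(model, x, y):
    """Confidence the model assigns to the true label of each point."""
    p1 = predict_proba(model, x)
    return np.where(y == 1, p1, 1.0 - p1)

def build_attack_set(n_shadow=5, n=200):
    """Stage 1: train shadow models and record outputs on members and non-members."""
    feats, labels = [], []
    for _ in range(n_shadow):
        x_in, y_in = sample_data(n)    # shadow training data (members)
        x_out, y_out = sample_data(n)  # held-out data (non-members)
        shadow = train(x_in, y_in)
        feats.append(true_class_confidence(shadow, x_in, y_in))
        labels.append(np.ones(n))
        feats.append(true_class_confidence(shadow, x_out, y_out))
        labels.append(np.zeros(n))
    return np.concatenate(feats), np.concatenate(labels)

def fit_threshold(feats, labels):
    """Stage 2: a one-parameter meta-model, the best confidence threshold."""
    candidates = np.linspace(0, 1, 101)
    accs = [((feats >= t) == labels).mean() for t in candidates]
    return candidates[int(np.argmax(accs))]

def run_attack(n=200):
    """Apply the meta-model to the outputs of a separately trained target model."""
    threshold = fit_threshold(*build_attack_set(n=n))
    x_in, y_in = sample_data(n)
    x_out, y_out = sample_data(n)
    target = train(x_in, y_in)
    conf = np.concatenate([
        true_class_confidence(target, x_in, y_in),
        true_class_confidence(target, x_out, y_out),
    ])
    truth = np.concatenate([np.ones(n), np.zeros(n)])
    return threshold, ((conf >= threshold) == truth).mean()

if __name__ == "__main__":
    threshold, accuracy = run_attack()
    print(f"threshold={threshold:.2f}, membership accuracy={accuracy:.2f}")
```

The sketch relies on the target model being overfit: because it memorizes its training points, it tends to be more confident on members than on unseen points, and the threshold learned from the shadow models is intended to transfer to the target for that reason.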

Theorems & Definitions (2)

  • definition 1: ($\epsilon,\delta$)-Differential Privacy
  • definition 2
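
For reference, the standard statement of ($\epsilon,\delta$)-differential privacy from the literature is given below; the survey's own wording and notation for definition 1 may differ. The symbols $\mathcal{M}$ (a randomized mechanism), $D$ and $D'$ (neighboring datasets that differ in a single record), and $S$ (a set of possible outputs) are chosen here for illustration.

$$
\Pr[\mathcal{M}(D) \in S] \le e^{\epsilon} \, \Pr[\mathcal{M}(D') \in S] + \delta
\quad \text{for all neighboring } D, D' \text{ and all } S \subseteq \mathrm{Range}(\mathcal{M}).
$$

Setting $\delta = 0$ recovers pure $\epsilon$-differential privacy; smaller values of $\epsilon$ and $\delta$ mean that the presence or absence of any single record changes the output distribution less.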