A Survey of Privacy Attacks in Machine Learning
Maria Rigaki, Sebastian Garcia
TL;DR
This survey addresses privacy and confidentiality challenges in machine learning by proposing a unifying threat model and a taxonomy that groups attacks into four categories: membership inference, reconstruction, property inference, and model extraction. It analyzes over 40 papers from 2014–2020, detailing attack designs, common implementation patterns (notably shadow training in centralized settings and gradient or parameter exposure in distributed settings), and defenses such as differential privacy and query-based detection. Key contributions include mapping attacker knowledge to attack outcomes, outlining causes of leakage (overfitting, model complexity, and data properties), and surveying defenses across attack types with a focus on the trade-off between practical utility and privacy guarantees. The survey also highlights open problems to guide future work: coverage beyond deep learning, a theoretical understanding of leakage mechanisms, testing under realistic deployment conditions, and the interplay of privacy with security, fairness, and explainability.
Abstract
As machine learning becomes more widely used, the need to study its implications for security and privacy becomes more urgent. Although the body of work in privacy has been steadily growing over the past few years, research on the privacy aspects of machine learning has received less focus than the security aspects. Our contribution is an analysis of more than 40 papers related to privacy attacks against machine learning that have been published during the past seven years. We propose an attack taxonomy, together with a threat model that allows the categorization of different attacks based on adversarial knowledge and the assets under attack. An initial exploration of the causes of privacy leaks is presented, as well as a detailed analysis of the different attacks. Finally, we present an overview of the most commonly proposed defenses and a discussion of the open problems and future directions identified during our analysis.
