Towards the Science of Security and Privacy in Machine Learning
Nicolas Papernot, Patrick McDaniel, Arunesh Sinha, Michael Wellman
TL;DR
The paper addresses the fragmented understanding of security and privacy in machine learning by proposing a unified threat model that spans the entire ML data pipeline and maps attacks and defenses to the CIA model (confidentiality, integrity, availability) and to privacy. It formalizes learning through the PAC framework and analyzes training-time poisoning, inference-time adversaries (white-box and black-box), and privacy and fairness considerations. Key contributions include a structured taxonomy of attacks and defenses, links between distribution drift and robustness, and a no-free-lunch theorem illustrating fundamental trade-offs between accuracy and resilience. The work highlights the need to calibrate model complexity, data availability, and defense strategies to environment-specific risks, laying groundwork for robust, private, and accountable ML systems.
Abstract
Advances in machine learning (ML) in recent years have enabled a dizzying array of applications such as data analytics, autonomous systems, and security diagnostics. ML is now pervasive: new systems and models are being deployed in every domain imaginable, leading to rapid and widespread deployment of software-based inference and decision making. There is growing recognition that ML exposes new vulnerabilities in software systems, yet the technical community's understanding of the nature and extent of these vulnerabilities remains limited. We systematize recent findings on ML security and privacy, focusing on attacks identified on these systems and defenses crafted to date. We articulate a comprehensive threat model for ML, and categorize attacks and defenses within an adversarial framework. Key insights from work in both the ML and security communities are identified, and the effectiveness of approaches is related to structural elements of ML algorithms and the data used to train them. We conclude by formally exploring the opposing relationship between model accuracy and resilience to adversarial manipulation. Through these explorations, we show that there are (possibly unavoidable) tensions between model complexity, accuracy, and resilience that must be calibrated for the environments in which they will be used.
