Privacy in Deep Learning: A Survey
Fatemehsadat Mireshghallah, Mohammadkazem Taram, Praneeth Vepakomma, Abhishek Singh, Ramesh Raskar, Hadi Esmaeilzadeh
TL;DR
Privacy in Deep Learning: A Survey analyzes privacy threats in deep learning (DL) arising from sensitive training data and model parameters, distinguishing direct exposure from indirect inference attacks. It surveys privacy-preserving mechanisms across data aggregation, training-time defenses (differential privacy, homomorphic encryption, and secure multi-party computation), and inference-time protections. It also covers privacy-enhancing execution models such as Federated Learning, Split Learning, and trusted execution environments (TEEs), along with their practical trade-offs in utility and performance. The paper identifies a gap in the literature on test-time inference privacy and calls for future research on it and on more holistic privacy assurances in DL deployments.
Abstract
The ever-growing advances of deep learning in many areas, including vision, recommendation systems, and natural language processing, have led to the adoption of Deep Neural Networks (DNNs) in production systems. Large datasets and high computational power are the main contributors to these advances. The datasets are usually crowdsourced and may contain sensitive information. This poses serious privacy concerns, as the data can be misused or leaked through various vulnerabilities. Even if the cloud provider and the communication link are trusted, there are still threats of inference attacks, in which an attacker could infer properties of the data used for training or recover the underlying model architecture and parameters. In this survey, we review the privacy concerns brought by deep learning and the mitigating techniques introduced to tackle these issues. We also show that there is a gap in the literature regarding test-time inference privacy, and propose possible future research directions.
