Privacy in Deep Learning: A Survey

Fatemehsadat Mireshghallah, Mohammadkazem Taram, Praneeth Vepakomma, Abhishek Singh, Ramesh Raskar, Hadi Esmaeilzadeh

TL;DR

This survey analyzes privacy threats in deep learning (DL) that arise from sensitive training data and model parameters, distinguishing direct exposure from indirect inference attacks. It reviews privacy-preserving mechanisms for data aggregation, training-time defenses (differential privacy, homomorphic encryption, and secure multi-party computation), and inference-time protections. It also covers privacy-enhancing execution models such as Federated Learning, Split Learning, and trusted execution environments (TEEs), along with their practical trade-offs in utility and performance. The paper identifies a gap in the literature on test-time inference privacy and calls for future research on it and on more holistic privacy assurances in DL deployments.

Abstract

The ever-growing advances of deep learning in many areas, including vision, recommendation systems, and natural language processing, have led to the adoption of Deep Neural Networks (DNNs) in production systems. The availability of large datasets and high computational power are the main contributors to these advances. The datasets are usually crowdsourced and may contain sensitive information. This poses serious privacy concerns, as this data can be misused or leaked through various vulnerabilities. Even if the cloud provider and the communication link are trusted, there are still threats of inference attacks, in which an attacker could infer properties of the data used for training or recover the underlying model architecture and parameters. In this survey, we review the privacy concerns brought by deep learning and the mitigating techniques introduced to tackle these issues. We also show that there is a gap in the literature regarding test-time inference privacy, and propose possible future research directions.

Paper Structure

This paper contains 28 sections, 3 equations, 7 figures, 3 tables.

Figures (7)

  • Figure 1: Categorization of existing threats against deep learning
  • Figure 2: The image on the left was recovered using the model inversion attack of Fredrikson et al.; the image on the right is from the training set. The attacker is given only the person’s name and access to a facial recognition system that returns a class confidence score.
  • Figure 3: Categorization of privacy-preserving schemes for deep learning.
  • Figure 4: Overview of how a deep learning framework works and how differential privacy can be applied to different parts of the pipeline.
  • Figure 5: Cloak's discovered features for target DNN classifiers (VGG-16) for black-hair color, eyeglasses, gender, and smile detection. The colored features are conducive to the task. The 3 sets of features depicted for each task correspond to different suppression ratios (SR). AL denotes the range of accuracy loss imposed by the suppression.
  • ...and 2 more figures

Theorems & Definitions (2)

  • Definition 3.1
  • Definition 3.2