Deep Models Under the GAN: Information Leakage from Collaborative Deep Learning

Briland Hitaj, Giuseppe Ateniese, Fernando Perez-Cruz

TL;DR

The paper presents an insider-driven GAN attack on collaborative deep learning: a participant in the training process can reconstruct prototypical samples of another participant's private data, even when differential privacy is applied to the shared parameters. It demonstrates that record-level DP does not guard against an active adversary who operates in real time, deceives victims into revealing more about their data, and amplifies the leakage through GAN-generated samples. In experiments on the MNIST digit dataset and the AT&T face dataset, the GAN attack leaks more information than traditional model inversion across several collaboration and privacy settings, which exposes a security risk in federated and decentralized learning. The work argues for protections stronger than record-level DP and for explicit consideration of insider threats, including possible cryptographic or device-level privacy mechanisms in distributed learning systems.

Abstract

Deep Learning has recently become hugely popular in machine learning, providing significant improvements in classification accuracy in the presence of highly-structured and large databases. Researchers have also considered privacy implications of deep learning. Models are typically trained in a centralized manner with all the data being processed by the same training algorithm. If the data is a collection of users' private data, including habits, personal pictures, geographical positions, interests, and more, the centralized server will have access to sensitive information that could potentially be mishandled. To tackle this problem, collaborative deep learning models have recently been proposed where parties locally train their deep learning structures and only share a subset of the parameters in the attempt to keep their respective training sets private. Parameters can also be obfuscated via differential privacy (DP) to make information extraction even more challenging, as proposed by Shokri and Shmatikov at CCS'15. Unfortunately, we show that any privacy-preserving collaborative deep learning is susceptible to a powerful attack that we devise in this paper. In particular, we show that a distributed, federated, or decentralized deep learning approach is fundamentally broken and does not protect the training sets of honest participants. The attack we developed exploits the real-time nature of the learning process that allows the adversary to train a Generative Adversarial Network (GAN) that generates prototypical samples of the targeted training set that was meant to be private (the samples generated by the GAN are intended to come from the same distribution as the training data). Interestingly, we show that record-level DP applied to the shared parameters of the model, as suggested in previous work, is ineffective (i.e., record-level DP is not designed to address our attack).
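The abstract mentions that participants "only share a subset of the parameters". A minimal sketch of one way to choose that subset is shown below, assuming a largest-magnitude selection rule over local gradients. The function name `select_largest_updates`, the `fraction` parameter, and the example values are hypothetical and are not taken from the paper.

```python
from typing import Dict, List


def select_largest_updates(gradients: List[float], fraction: float) -> Dict[int, float]:
    """Return {index: gradient} for the largest-magnitude share of the entries.

    Assumption: "share a subset" is modelled as keeping the top `fraction`
    of coordinates by absolute value; other selection rules are possible.
    """
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    # Share at least one coordinate so the participant always contributes.
    keep = max(1, int(len(gradients) * fraction))
    ranked = sorted(range(len(gradients)), key=lambda i: abs(gradients[i]), reverse=True)
    return {i: gradients[i] for i in sorted(ranked[:keep])}


# Hypothetical local gradient vector of one participant.
local_gradients = [0.02, -0.90, 0.10, 0.45, -0.03, 0.00, 0.30, -0.60, 0.01, 0.07]
shared = select_largest_updates(local_gradients, fraction=0.3)
print(shared)
```

Only the selected coordinates would leave the device in such a scheme. The attack described in this paper is aimed at exactly this kind of shared update, so limiting what is shared reduces, but does not remove, what the adversary can learn from.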

Paper Structure

This paper contains 30 sections, 1 theorem, 2 equations, 18 figures, and 1 algorithm.

Key Result

Theorem 5.1

The global minimum of the virtual training criterion in GAN is achieved if and only if $p(\mathbf{x}) = p(g(\mathbf{z};\theta_G))$.
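For context, this statement appears to be the standard GAN optimality result, and the derivation can be sketched as follows. The discriminator symbol $f(\mathbf{x};\theta_D)$, the noise prior $p(\mathbf{z})$, and the shorthand $p_g$ for the density of $g(\mathbf{z};\theta_G)$ are notational assumptions, since this excerpt does not define them.

```latex
% Two-player GAN objective (assumed notation: discriminator f, generator g)
\min_{\theta_G}\max_{\theta_D}\;
V(\theta_D,\theta_G)
  = \mathbb{E}_{\mathbf{x}\sim p(\mathbf{x})}\bigl[\log f(\mathbf{x};\theta_D)\bigr]
  + \mathbb{E}_{\mathbf{z}\sim p(\mathbf{z})}\bigl[\log\bigl(1 - f(g(\mathbf{z};\theta_G);\theta_D)\bigr)\bigr]

% For a fixed generator, the optimal (unrestricted) discriminator is
f^{*}(\mathbf{x}) = \frac{p(\mathbf{x})}{p(\mathbf{x}) + p_g(\mathbf{x})}

% Substituting f^* gives the virtual training criterion
C(\theta_G) = \max_{\theta_D} V(\theta_D,\theta_G)
            = -\log 4 + 2\,\mathrm{JSD}\bigl(p \,\|\, p_g\bigr)
            \;\ge\; -\log 4

% The Jensen-Shannon divergence is zero iff p = p_g,
% so the global minimum is reached exactly when p(x) = p(g(z; \theta_G)).
```

The result assumes a discriminator with enough capacity to reach $f^{*}$; with a finite network the equality holds only approximately.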

Figures (18)

  • Figure 1: Two approaches for distributed deep learning. In (a), the red links show sharing of the data between the users and the server. Only the server can compromise the privacy of the data. In (b), the red links show sharing of the model parameters. In this case a malicious user employing a GAN can deceive any victim into releasing their private information.
  • Figure 2: Picture of Alice on the victim's phone, $X$, and its GAN reconstruction, $X'$. Note that $X'\neq X$, and $X'$ was not in the training set. But $X'$ is essentially indistinguishable from $X$.
  • Figure 3: GAN-generated samples for the 'horse' class from the CIFAR-10 dataset.
  • Figure 4: GAN Attack on collaborative deep learning. The victim on the right trains the model with images of 3s (class $a$) and images of 1s (class $b$). The adversary only has images of class $b$ (1s) and uses its label $c$ and a GAN to fool the victim into releasing information about class $a$. The attack can be easily generalized to several classes and users. The adversary does not even need to start with any true samples.
  • Figure 5: Results obtained when running a model inversion attack (MIA) and a generative adversarial network (DCGAN) on a CNN trained on the MNIST dataset. MIA fails to produce clear results, while DCGAN is successful.
  • ...and 13 more figures
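The attack loop described in the Figure 4 caption can be sketched on a toy problem. This is a heavily simplified illustration under stated assumptions, not the paper's implementation: the shared model is a linear softmax classifier over 2-D points instead of a CNN, the "generator" is only a learned shift of Gaussian noise instead of a DCGAN, every parameter is shared, and all names (`run_attack`, `generator_step`, the cluster centres, the learning rates) are hypothetical.

```python
import math
import random

A, B, C = 0, 1, 2  # A: victim-only class, B: class both hold, C: adversary's decoy label


def softmax(logits):
    top = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - top) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict(model, x):
    weights, biases = model
    return softmax([w[0] * x[0] + w[1] * x[1] + b for w, b in zip(weights, biases)])


def sgd_step(model, x, label, lr=0.05):
    """One cross-entropy SGD step on the shared model, applied in place."""
    weights, biases = model
    probs = predict(model, x)
    for k in range(3):
        err = (1.0 if k == label else 0.0) - probs[k]
        weights[k][0] += lr * err * x[0]
        weights[k][1] += lr * err * x[1]
        biases[k] += lr * err


def generator_step(model, theta, rng, lr=0.05, batch=8):
    """Shift the toy generator so its samples look more like class A to the model."""
    weights, _ = model
    grad = [0.0, 0.0]
    for _ in range(batch):
        x = [theta[0] + rng.gauss(0, 0.1), theta[1] + rng.gauss(0, 0.1)]
        probs = predict(model, x)
        for d in range(2):
            # Gradient of log p(A | x) with respect to x for a linear softmax model.
            grad[d] += weights[A][d] - sum(probs[k] * weights[k][d] for k in range(3))
    return [theta[d] + lr * grad[d] / batch for d in range(2)]


def cluster(center, n, rng):
    return [[center[0] + rng.gauss(0, 0.3), center[1] + rng.gauss(0, 0.3)] for _ in range(n)]


def run_attack(rounds=30, seed=0):
    rng = random.Random(seed)
    model = ([[0.0, 0.0] for _ in range(3)], [0.0, 0.0, 0.0])
    victim_data = [(x, A) for x in cluster((2, 2), 10, rng)]
    victim_data += [(x, B) for x in cluster((-2, -2), 10, rng)]
    adversary_data = [(x, B) for x in cluster((-2, -2), 10, rng)]
    theta = [0.0, 0.0]  # generator parameters; the adversary holds no class-A data
    injected = []
    for _ in range(rounds):
        for x, y in victim_data:  # victim trains locally and shares the update
            sgd_step(model, x, y)
        theta = generator_step(model, theta, rng)  # shared model acts as discriminator
        fake = [theta[0] + rng.gauss(0, 0.1), theta[1] + rng.gauss(0, 0.1)]
        injected.append((fake, C))  # generated class-A lookalike, mislabelled as C
        for x, y in adversary_data + [(fake, C)]:
            sgd_step(model, x, y)  # adversary's update enters the shared model
    return theta, injected, model


theta, injected, model = run_attack()
print("generator shift:", theta)
print("p(A | generated):", predict(model, theta)[A])
```

The mislabelled samples are the deceptive part of the loop: because they sit close to the victim's class A but carry label C, the victim has to keep refining the boundary around class A, and those refinements are what the generator step reads back out of the shared model.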

Theorems & Definitions (1)

  • Theorem 5.1