Table of Contents
Fetching ...

Privacy Leakage on DNNs: A Survey of Model Inversion Attacks and Defenses

Hao Fang, Yixiang Qiu, Hongyao Yu, Wenbo Yu, Jiawei Kong, Baoli Chong, Bin Chen, Xuan Wang, Shu-Tao Xia, Ke Xu

TL;DR

This paper elaborately analyze and compare numerous recent attacks and defenses on Deep Neural Networks (DNNs) across multiple modalities and learning tasks, and summarizes and classify these methods into different categories and provides a novel taxonomy.

Abstract

Deep Neural Networks (DNNs) have revolutionized various domains with their exceptional performance across numerous applications. However, Model Inversion (MI) attacks, which disclose private information about the training dataset by abusing access to the trained models, have emerged as a formidable privacy threat. Given a trained network, these attacks enable adversaries to reconstruct high-fidelity data that closely aligns with the private training samples, posing significant privacy concerns. Despite the rapid advances in the field, we lack a comprehensive and systematic overview of existing MI attacks and defenses. To fill this gap, this paper thoroughly investigates this realm and presents a holistic survey. Firstly, our work briefly reviews early MI studies on traditional machine learning scenarios. We then elaborately analyze and compare numerous recent attacks and defenses on Deep Neural Networks (DNNs) across multiple modalities and learning tasks. By meticulously analyzing their distinctive features, we summarize and classify these methods into different categories and provide a novel taxonomy. Finally, this paper discusses promising research directions and presents potential solutions to open issues. To facilitate further study on MI attacks and defenses, we have implemented an open-source model inversion toolbox on GitHub (https://github.com/ffhibnese/Model-Inversion-Attack-ToolBox).

Privacy Leakage on DNNs: A Survey of Model Inversion Attacks and Defenses

TL;DR

This paper elaborately analyze and compare numerous recent attacks and defenses on Deep Neural Networks (DNNs) across multiple modalities and learning tasks, and summarizes and classify these methods into different categories and provides a novel taxonomy.

Abstract

Deep Neural Networks (DNNs) have revolutionized various domains with their exceptional performance across numerous applications. However, Model Inversion (MI) attacks, which disclose private information about the training dataset by abusing access to the trained models, have emerged as a formidable privacy threat. Given a trained network, these attacks enable adversaries to reconstruct high-fidelity data that closely aligns with the private training samples, posing significant privacy concerns. Despite the rapid advances in the field, we lack a comprehensive and systematic overview of existing MI attacks and defenses. To fill this gap, this paper thoroughly investigates this realm and presents a holistic survey. Firstly, our work briefly reviews early MI studies on traditional machine learning scenarios. We then elaborately analyze and compare numerous recent attacks and defenses on Deep Neural Networks (DNNs) across multiple modalities and learning tasks. By meticulously analyzing their distinctive features, we summarize and classify these methods into different categories and provide a novel taxonomy. Finally, this paper discusses promising research directions and presents potential solutions to open issues. To facilitate further study on MI attacks and defenses, we have implemented an open-source model inversion toolbox on GitHub (https://github.com/ffhibnese/Model-Inversion-Attack-ToolBox).
Paper Structure (54 sections, 16 equations, 17 figures, 1 table)

This paper contains 54 sections, 16 equations, 17 figures, 1 table.

Figures (17)

  • Figure 1: An illustration of model inversion attacks. The service provider trains the model $f_0(\cdot)$ with private dataset $X$ and obtains $f_{\theta}(\cdot)$ that is then released for public usage. During the model inference stage, the adversary inverts the available $f_{\theta}(\cdot)$ to acquire recovered data $\hat{X}$, which highly resembles the inaccessible $X$.
  • Figure 2: Threat model of generative model inversion attacks against image classification tasks.
  • Figure 4: Threat model of generative model inversion attacks against image classification tasks.
  • Figure 5: An example of privacy leakage on LLMs.
  • Figure 6: Different types of MI attacks on graph data.
  • ...and 12 more figures