Model Inversion Attacks: A Survey of Approaches and Countermeasures

Zhanke Zhou, Jianing Zhu, Fengfei Yu, Xuan Li, Xiong Peng, Tongliang Liu, Bo Han

TL;DR

This survey provides a formal and cross-domain view of model inversion attacks (MIAs), clarifying how attackers recover training data from a trained model under varying prior knowledge and access regimes. It categorizes image, text, and graph MIAs into optimization-based and training-based families, surveys representative methods, and discusses defense strategies across training-time and inference-time settings. The work catalogs datasets and metrics used to evaluate MIAs and defenses, highlights key challenges, and outlines future directions including foundation-model privacy and diffusion-based inversion techniques. Overall, the paper aims to unify the understanding of MIAs across modalities and guide the design of robust, privacy-preserving defenses with practical impact for real-world deployments.

Abstract

The success of deep neural networks has driven numerous research studies and applications, from Euclidean to non-Euclidean data. However, there are increasing concerns about privacy leakage, as these networks rely on processing private data. Recently, a new type of privacy attack has emerged: the model inversion attack (MIA), which aims to extract sensitive features of the private training data by abusing access to a well-trained model. The effectiveness of MIAs has been demonstrated in various domains, including images, texts, and graphs. These attacks highlight the vulnerability of neural networks and raise awareness of the risk of privacy leakage within the research community. Despite their significance, there is a lack of systematic studies that provide a comprehensive overview of, and deeper insights into, MIAs across different domains. This survey aims to summarize up-to-date MIA methods in both attacks and defenses, highlighting their contributions and limitations, underlying modeling principles, optimization challenges, and future directions. We hope this survey bridges the gap in the literature and facilitates future research in this critical area. We also maintain a repository that tracks relevant research at https://github.com/AndrewZhou924/Awesome-model-inversion-attack.

Paper Structure

This paper contains 27 sections, 2 equations, 17 figures, 6 tables.

Figures (17)

  • Figure 1: Pipeline illustration of a model inversion attack (MIA) in supervised machine learning, together with its attack and defense targets. Given a released model trained on private data, an MIA seeks a reverse hypothesis that recovers the training data, while the defender tries to make this recovery fail. The notation and definitions are explained in the problem-definition section.
  • Figure 2: Taxonomy of model inversion adversaries across domains, covered from the section on image MIAs through the section on defending against MIAs. The full approaches and adopted scenarios are elaborated in the corresponding sections. Related datasets and evaluation metrics are summarized later, in the section on datasets and evaluation.
  • Figure 3: Comparison of different privacy attacks with illustrations regarding the objective and specific privacy concerns.
  • Figure 4: An evolutionary graph of research on image MIAs, with illustrations of recovered examples.
  • Figure 5: Illustration of a model inversion attack on standard classification in the image domain.
  • ...and 12 more figures

Theorems & Definitions (6)

  • Definition 2.1: Supervised machine learning
  • Definition 2.2: Model inversion attacks
  • Remark 2.1: Strict model inversion attacks
  • Remark 2.2: The extent of inversion
  • Definition 2.3: Target of MIAs
  • Definition 2.4: Target of defending against MIAs