A Closer Look at GAN Priors: Exploiting Intermediate Features for Enhanced Model Inversion Attacks

Yixiang Qiu, Hao Fang, Hongyao Yu, Bin Chen, MeiKang Qiu, Shu-Tao Xia

TL;DR

This work investigates privacy risks from model inversion attacks and proposes IF-GMI, a novel MI method that uses intermediate features of a pre-trained StyleGAN2 generator, not just latent codes. By disassembling the generator into blocks and optimizing both latent vectors and intermediate features under an $l_1$ ball constraint, IF-GMI achieves state-of-the-art attack performance, particularly in out-of-distribution scenarios, and demonstrates strong transferability across datasets and target models. The approach combines initial selection, hierarchical feature optimization, and a robust identity loss to recover high-fidelity, diverse images that closely resemble private data. These results underscore the potent privacy leakage risk from GAN priors and motivate the development of defenses to mitigate MI threats in practical deployments.

Abstract

Model Inversion (MI) attacks aim to reconstruct privacy-sensitive training data from released models by utilizing output information, raising extensive concerns about the security of Deep Neural Networks (DNNs). Recent advances in generative adversarial networks (GANs) have contributed significantly to the improved performance of MI attacks due to their powerful ability to generate realistic images with high fidelity and appropriate semantics. However, previous MI attacks have solely disclosed private information in the latent space of GAN priors, limiting their semantic extraction and transferability across multiple target models and datasets. To address this challenge, we propose a novel method, Intermediate Features enhanced Generative Model Inversion (IF-GMI), which disassembles the GAN structure and exploits features between intermediate blocks. This allows us to extend the optimization space from latent code to intermediate features with enhanced expressive capabilities. To prevent GAN priors from generating unrealistic images, we apply an $l_1$ ball constraint to the optimization process. Experiments on multiple benchmarks demonstrate that our method significantly outperforms previous approaches and achieves state-of-the-art results under various settings, especially in the out-of-distribution (OOD) scenario. Our code is available at: https://github.com/final-solution/IF-GMI
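
The abstract says that an $l_1$ ball constraint keeps the optimized features realistic, but it does not say how the constraint is enforced. One common option is to project back onto the ball after every update. The sketch below uses the standard sort-based Euclidean projection (Duchi et al., 2008) on plain Python lists. The function names, the choice to center the ball on an anchor feature, and the radius are assumptions made for illustration, not details taken from the paper or its repository.

```python
import math


def project_onto_l1_ball(v, radius):
    """Euclidean projection of v onto {x : ||x||_1 <= radius}."""
    if radius <= 0:
        raise ValueError("radius must be positive")
    magnitudes = [abs(x) for x in v]
    if sum(magnitudes) <= radius:
        return list(v)  # already inside the ball, nothing to do

    # Find the soft-threshold theta from the magnitudes sorted in
    # descending order (sort-based method of Duchi et al., 2008).
    theta = 0.0
    running_sum = 0.0
    for k, m in enumerate(sorted(magnitudes, reverse=True), start=1):
        running_sum += m
        candidate = (running_sum - radius) / k
        if m - candidate > 0:
            theta = candidate
        else:
            break

    # Shrink every magnitude by theta and restore the original signs.
    return [math.copysign(max(m - theta, 0.0), x) for m, x in zip(magnitudes, v)]


def constrain_feature(feature, anchor, radius):
    """Keep `feature` within an l1 ball of `radius` centered on `anchor`.

    Assumption: the ball is centered on a reference feature (for example
    the value the feature had before optimization started).
    """
    delta = [f - a for f, a in zip(feature, anchor)]
    delta = project_onto_l1_ball(delta, radius)
    return [a + d for a, d in zip(anchor, delta)]


# Hypothetical usage: a feature that drifted too far from its anchor is
# pulled back so that the l1 distance equals the radius.
anchor = [1.0, 1.0]
drifted = [4.0, 2.0]
constrained = constrain_feature(drifted, anchor, radius=2.0)
print(constrained, sum(abs(c - a) for c, a in zip(constrained, anchor)))
```

Projecting the offset from the anchor, rather than the feature itself, bounds how far the optimization can move from its starting point without shrinking the feature toward zero.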

Paper Structure (31 sections, 5 equations, 6 figures, 12 tables, 1 algorithm)

This paper contains 31 sections, 5 equations, 6 figures, 12 tables, 1 algorithm.

Figures (6)

  • Figure 1: (a) Comparison of our proposed IF-GMI with baselines. The blue number below each image is the confidence predicted by the evaluation model. The first column shows randomly generated images, and the second column presents the results reconstructed by PPA [struppek2022plug], a typical GAN-based method that directly optimizes the latent code of the GAN. The last two columns show the results of our proposed IF-GMI and the ground truth images in the private dataset, respectively. (b) Top-$1$ attack accuracy of PPA and IF-GMI (ours) on four OOD scenarios.
  • Figure 2: Overview of our proposed IF-GMI. First, latent vectors are sampled from a standard Gaussian distribution and mapped by the Mapping Network into disentangled latent codes with semantic meaning. We then perform random augmentation on these latent codes to select the optimal ones, denoted $\mathbf{w}^*$, for optimization. The Synthesis Network is disassembled into multiple blocks to search the intermediate features, which are successively updated with the identity loss calculated from the target model. Finally, the reconstructed images are generated from the last layer.
  • Figure 3: (a) Comparison of the $Acc@1$ metric under various settings of $L$ (i.e., the number of intermediate features). (b) Visual results generated from different end layers. We define $L=0$ as the special case in which our method degenerates into optimizing only the latent vectors $\mathbf{w}$.
  • Figure 4: Visual comparison of reconstructed images from different methods against the ResNet-18 trained on FaceScrub. The first column shows ground truth images of the target class in the private dataset.
  • Figure A1: Visual comparison of reconstructed images from different methods against the ResNet-18 [resnet] trained on FaceScrub. The GAN prior is pre-trained on MetFaces. The first column shows ground truth images of the target class in the private dataset.
  • ...and 1 more figures
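
The block-wise search described in the Figure 2 caption can be illustrated with a toy example. The sketch below is not the authors' implementation. Three scalar "blocks" stand in for the StyleGAN2 synthesis blocks, a squared distance to a hypothetical target stands in for the identity loss, gradients are taken by finite differences, and both initial selection and the norm constraint are left out for brevity. All function names are hypothetical.

```python
import math


def run_blocks(blocks, x):
    """Feed x through a sequence of generator blocks."""
    for block in blocks:
        x = block(x)
    return x


def numerical_grad(fn, x, eps=1e-5):
    """Central finite-difference gradient of a scalar function fn at x."""
    grad = []
    for i in range(len(x)):
        hi = x[:i] + [x[i] + eps] + x[i + 1:]
        lo = x[:i] + [x[i] - eps] + x[i + 1:]
        grad.append((fn(hi) - fn(lo)) / (2 * eps))
    return grad


def optimise(fn, x, steps=200, lr=0.1):
    """Plain gradient descent on fn, starting from x."""
    x = list(x)
    for _ in range(steps):
        g = numerical_grad(fn, x)
        x = [xi - lr * gi for xi, gi in zip(x, g)]
    return x


def hierarchical_search(blocks, w, loss, num_feature_stages):
    """Optimise the latent code first, then successive intermediate features."""
    if not 0 <= num_feature_stages < len(blocks):
        raise ValueError("num_feature_stages must be in [0, len(blocks))")

    # Stage 0: optimise the latent code through the whole generator.
    current = optimise(lambda z: loss(run_blocks(blocks, z)), w)
    history = [loss(run_blocks(blocks, current))]

    for i in range(1, num_feature_stages + 1):
        # Push the current variable through one more block to obtain the
        # next intermediate feature, then optimise that feature directly.
        current = blocks[i - 1](current)
        rest = blocks[i:]
        current = optimise(lambda f: loss(run_blocks(rest, f)), current)
        history.append(loss(run_blocks(rest, current)))

    image = run_blocks(blocks[num_feature_stages:], current)
    return image, history


# Toy generator: three element-wise blocks (stand-ins, not StyleGAN2).
blocks = [
    lambda x: [2.0 * xi for xi in x],
    lambda x: [math.tanh(xi) for xi in x],
    lambda x: [xi + 0.5 for xi in x],
]
target = [1.8, 0.3]  # hypothetical stand-in for what the target model rewards


def toy_identity_loss(image):
    return sum((o - t) ** 2 for o, t in zip(image, target))


image, history = hierarchical_search(
    blocks, [0.1, -0.2], toy_identity_loss, num_feature_stages=2
)
print(image, history)
```

Setting `num_feature_stages=0` reduces the search to optimizing the latent code alone, which mirrors the $L=0$ case in the Figure 3 caption. In this toy setup the `tanh` block caps the first output below 1.5, so the latent-only stage cannot reach the target value of 1.8, while the last feature stage can.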