Rethinking Model Inversion Attacks With Patch-Wise Reconstruction

Jonggyu Jang, Hyeonsu Lyu, Hyun Jong Yang

TL;DR

We propose Patch-MI, a method inspired by a jigsaw puzzle that offers a novel probabilistic interpretation of MI attacks. Through patch-based reconstruction, it creates images that closely mimic the distribution of image patches in the target dataset.

Abstract

Model inversion (MI) attacks aim to infer or reconstruct the training dataset by reverse-engineering the target model's weights. Recent advances in generative models have enabled MI attacks to produce photo-realistic replicas of the training dataset, an approach known as generative MI. Generative MI primarily searches for latent vectors that correspond to specific target labels, using a generative model trained on an auxiliary dataset. However, an important aspect is often overlooked: MI attacks fail if the pre-trained generative model lacks the coverage to create an image corresponding to the target label, especially when the target and auxiliary datasets differ significantly. To address this gap, we propose the Patch-MI method, inspired by a jigsaw puzzle, which offers a novel probabilistic interpretation of MI attacks. Even with a dissimilar auxiliary dataset, our method uses patch-based reconstruction to create images that closely mimic the distribution of image patches in the target dataset. Moreover, we numerically demonstrate that Patch-MI improves top-1 attack accuracy by 5 percentage points compared to existing methods.

Paper Structure (16 sections, 4 theorems, 18 equations, 7 figures, 5 tables)

Key Result

Lemma 1

Under the assumption labeled assp:ineq_during_training, if $\mathbf{x}\sim q(\mathbf{x})$, the following inequality is satisfied:

Figures (7)

  • Figure 1: Illustration of our motivation. Using an 8x8 patch-wise discriminator, the Patch-MI method successfully synthesizes target images (English alphabet) from the auxiliary dataset (Korean Hangeul), even though the auxiliary dataset is dissimilar to the target alphabet dataset.
  • Figure 2: Illustration of the Patch-MI attack method. For the generator, we simply reuse the standard DCGAN structure for image generation. For the patch-wise discriminator, we first apply a Conv2d layer whose kernel size equals the patch size and whose stride equals the patch size minus the number of overlapping pixels. Subsequently, 1x1 convolution layers compute $D_i$. Furthermore, the generated images undergo random transformations before being passed to the target classifier.
  • Figure 3: Examples of images generated by our method and by a canonical GAN. Both generative models are trained on the MNIST dataset, i.e., handwritten digits from 0 to 9. In our approach, the patch and stride sizes of the discriminator are set to 8 and 4, respectively, for an image size of 32. As the figure shows, our technique can create images that are not contained in the MNIST dataset.
  • Figure 4: Depiction of the target dataset and auxiliary dataset for two distinct experiments. In the experiment 1 setup, the target dataset is MNIST and the auxiliary dataset is SERI95. Analogously, in the experiment 2 setup, the target dataset is EMNIST-letter.
  • Figure 5: Visualization of randomly chosen outputs of various MI attack methods in experiments 1 and 2.
  • ...and 2 more figures
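
The patch geometry described in the captions for Figures 2 and 3 can be checked with a little arithmetic. The sketch below is illustrative only and is not the authors' code: the helper name `patch_grid` is hypothetical, and it assumes a square image and a first convolution without padding, neither of which the captions state. It derives the stride from the patch size and overlap, then counts how many patch positions the discriminator would score.

```python
def patch_grid(image_size, patch_size, overlap):
    """Return (stride, patches_per_side, offsets) for a square image.

    Hypothetical helper for illustration. Assumes no padding, so a patch
    is counted only if it fits entirely inside the image.
    """
    if not 0 <= overlap < patch_size:
        raise ValueError("overlap must satisfy 0 <= overlap < patch_size")
    if patch_size > image_size:
        raise ValueError("patch_size must not exceed image_size")

    # Stride as described in the Figure 2 caption: patch size minus overlap.
    stride = patch_size - overlap

    # Standard output-size formula for an unpadded convolution.
    patches_per_side = (image_size - patch_size) // stride + 1

    # Top-left coordinate of each patch along one axis.
    offsets = [i * stride for i in range(patches_per_side)]
    return stride, patches_per_side, offsets


# Figure 3 setting: image size 32, patch size 8, stride 4 (overlap of 4 pixels).
stride, per_side, offsets = patch_grid(image_size=32, patch_size=8, overlap=4)
total_patches = per_side * per_side
print(stride, per_side, total_patches, offsets)
```

Under these assumptions, the Figure 3 setting yields a 7x7 grid, i.e., 49 overlapping patches per image, each of which would receive its own discriminator output $D_i$.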

Theorems & Definitions (6)

  • Definition 1: Sampling probability inequality
  • Lemma 1
  • Lemma 2
  • Theorem 1
  • proof
  • Proposition 1