Inference Attacks Against Face Recognition Model without Classification Layers

Yuanqing Huang, Huilong Chen, Yinggui Wang, Lei Wang

TL;DR

This work investigates privacy leakage in face recognition (FR) models that are deployed without a classification layer at inference time. It introduces a two-stage attack. The first stage is a membership inference (MI) attack that exploits the distance between intermediate features and Batch Normalization (BN) statistics, $d_i = \frac{1}{b_i} \|\overline{v}_i - u_i\|^2$, which serves as the input signal for a lightweight attack classifier. The second stage is a GAN-guided model inversion attack that searches the StyleGAN latent space to synthesize faces resembling the training data. Experiments on CASIA-WebFace and MS1M-ArcFace show that the BN-based MI attack can outperform some baselines and that the inversion stage can recover some training identities without using the classification layer, though it still lags behind classifier-based attacks. The results indicate that defenses focused on the classification layer are insufficient for backbone-only FR systems, and that privacy-preserving designs need to address the backbone and its BN statistics directly.
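
The distance in the TL;DR can be illustrated with a short sketch. The summary does not define the symbols, so the meanings used here are assumptions: $\overline{v}_i$ is taken to be the per-channel mean of the features entering BN layer $i$, $u_i$ the mean stored in that layer, and $b_i$ the number of channels. The function names are hypothetical and plain Python lists stand in for real tensors.

```python
from typing import List, Sequence


def bn_distance(feature_mean: Sequence[float], bn_mean: Sequence[float]) -> float:
    """Compute d_i = (1 / b_i) * ||v_bar_i - u_i||^2 for one BN layer.

    Assumed meanings (not confirmed by the summary):
      feature_mean -> v_bar_i, per-channel mean of the features at layer i
      bn_mean      -> u_i, the mean stored in BN layer i
      b_i          -> number of channels in that layer
    """
    if len(feature_mean) != len(bn_mean):
        raise ValueError("feature mean and BN mean must have the same length")
    b_i = len(bn_mean)
    squared_norm = sum((v - u) ** 2 for v, u in zip(feature_mean, bn_mean))
    return squared_norm / b_i


def distance_vector(
    feature_means: Sequence[Sequence[float]],
    bn_means: Sequence[Sequence[float]],
) -> List[float]:
    """Stack one distance per BN layer into a feature vector for an attack classifier."""
    return [bn_distance(v, u) for v, u in zip(feature_means, bn_means)]


# Toy values for two hypothetical BN layers with 2 and 3 channels.
feature_means = [[1.0, 2.0], [0.0, 0.0, 3.0]]
bn_means = [[1.0, 0.0], [0.0, 0.0, 0.0]]
distances = distance_vector(feature_means, bn_means)
print(distances)  # [2.0, 3.0]
```

Under these assumptions, a sample whose features sit close to the stored BN means yields small distances; the attack classifier described above would take the resulting vector as input.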

Abstract

Face recognition (FR) has been applied to nearly every aspect of daily life, but it is always accompanied by the underlying risk of leaking private information. At present, almost all attack models against FR rely heavily on the presence of a classification layer. However, in practice, the FR model can obtain complex features of the input via the model backbone and then compare them with the target for inference, which does not explicitly involve the outputs of the classification layer, whether trained with a logit-based or another loss. In this work, we advocate a novel inference attack composed of two stages for practical FR models without a classification layer. The first stage is the membership inference attack. Specifically, we analyze the distances between the intermediate features and batch normalization (BN) parameters. The results indicate that this distance is a critical metric for membership inference. We thus design a simple but effective attack model that can determine whether a face image is from the training dataset or not. The second stage is the model inversion attack, where sensitive private data is reconstructed using a pre-trained generative adversarial network (GAN) guided by the attack model in the first stage. To the best of our knowledge, the proposed attack model is the very first in the literature developed for FR models without a classification layer. We illustrate the application of the proposed attack model in the establishment of privacy-preserving FR techniques.

Paper Structure (17 sections, 8 equations, 4 figures, 4 tables)

Figures (4)

  • Figure 1: Schematic diagram of the training and inference process of the FR model. Our attack targets the backbone, while traditional attacks focus on the output of the classification layer, which can be removed in the generalized FR process.
  • Figure 2: Diagram of the proposed two-stage inference attack against FR models without classification layers. The first stage is the membership inference attack: the attack model uses the parameters of the Batch Normalization layers to determine whether a sample belongs to the training dataset. The second stage is the model inversion attack: StyleGAN synthesizes images, which are optimized using the output of the first-stage attack model.
  • Figure 3: Some results of the model inversion attack in case 2. The first and third rows show the original training images, while the second and fourth rows show the synthesized images generated by our proposed algorithm without using the classification layer.
  • Figure 4: Comparison of images generated without (1st column) and with (2nd column) the loss proposed in the paper. With the proposed loss, the generated images are more similar to the target person than those generated without it.
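
The second stage shown in Figure 2 can be sketched as a search over a generator's latent space that keeps candidates the first-stage attack model scores as more member-like. This is a minimal sketch, not the paper's procedure: the summary does not state the optimizer, so random-search hill climbing is swapped in here, and `toy_member_score` is a hypothetical stand-in for the full chain of generator, FR backbone, and attack model.

```python
import random
from typing import Callable, List, Tuple

Latent = List[float]


def invert_by_latent_search(
    score: Callable[[Latent], float],
    dim: int,
    steps: int = 200,
    step_size: float = 0.1,
    seed: int = 0,
) -> Tuple[Latent, float]:
    """Hill-climb a latent vector so that `score` (higher = more member-like) increases.

    In a real pipeline `score(z)` would generate an image from z, run it through
    the FR backbone, and return the attack model's membership confidence.
    Random search is an assumption made for this sketch.
    """
    rng = random.Random(seed)
    best = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    best_score = score(best)
    for _ in range(steps):
        # Propose a small Gaussian perturbation of the current best latent.
        candidate = [z + rng.gauss(0.0, step_size) for z in best]
        candidate_score = score(candidate)
        # Accept the proposal only if the membership score improves.
        if candidate_score > best_score:
            best, best_score = candidate, candidate_score
    return best, best_score


def toy_member_score(z: Latent) -> float:
    """Hypothetical score: peaks at an arbitrary 'training-like' latent."""
    target = [0.5, -1.0, 2.0]
    return -sum((a - b) ** 2 for a, b in zip(z, target))


latent, final_score = invert_by_latent_search(toy_member_score, dim=3)
print(final_score)
```

Because a proposal is accepted only when the score improves, the returned score is never lower than that of the initial random latent. A gradient-based optimizer would be the more likely choice when the generator and backbone are differentiable.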