Rethinking Impersonation and Dodging Attacks on Face Recognition Systems
Fengfan Zhou, Qianyu Zhou, Bangjie Yin, Hui Zheng, Xuequan Lu, Lizhuang Ma, Hefei Ling
TL;DR
Face recognition (FR) systems can be deceived by adversarial examples, yet a successful impersonation attack does not guarantee a successful dodging attack in the black-box setting, due to open-set, multi-identity samples. The authors introduce Adv-Pruning, a three-stage attack (Priming, Pruning, Restoration) that uses Adversarial Priority Quantification to prune low-impact perturbations and Biased Gradient Adaptation to bias the added perturbations toward dodging while preserving impersonation. They formalize impersonation ($\mathcal{L}^{i}$) and dodging ($\mathcal{L}^{d}$) losses and show that a multi-task objective $\mathcal{L} = \lambda \mathcal{L}^{i} + \mathcal{L}^{d}$ performs poorly in black-box settings; Adv-Pruning mitigates this by freeing space for dodging-focused perturbations. Extensive experiments across datasets and models demonstrate significantly improved dodging attack success rate (ASR) with minimal loss in impersonation performance, including under JPEG compression and on adversarially robust FR models, indicating practical value for assessing FR security.
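To make the multi-task objective $\mathcal{L} = \lambda \mathcal{L}^{i} + \mathcal{L}^{d}$ concrete, the following is a minimal sketch. It assumes cosine-similarity-based losses on face embeddings, a common choice in FR attacks; the exact loss definitions are not given in this summary, so the function names and loss forms below are illustrative assumptions rather than the authors' formulation.

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def impersonation_loss(emb_adv, emb_target):
    # Assumed form: small when the adversarial embedding matches the target identity.
    return 1.0 - cosine_similarity(emb_adv, emb_target)

def dodging_loss(emb_adv, emb_source):
    # Assumed form: small when the adversarial embedding moves away from the source identity.
    return cosine_similarity(emb_adv, emb_source)

def multi_task_loss(emb_adv, emb_source, emb_target, lam=1.0):
    # L = lambda * L^i + L^d, as written in the summary above.
    return lam * impersonation_loss(emb_adv, emb_target) + dodging_loss(emb_adv, emb_source)
```

With orthogonal source and target embeddings, an adversarial embedding equal to the target minimizes both terms at once under these assumed losses; the difficulty described above arises when the attacker's surrogate model and the black-box victim disagree about where those embeddings lie.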
Abstract
Face Recognition (FR) systems can be easily deceived by adversarial examples that manipulate benign face images through imperceptible perturbations. Adversarial attacks on FR encompass two types: impersonation (targeted) attacks and dodging (untargeted) attacks. Previous methods often achieve a successful impersonation attack on FR; however, this does not necessarily guarantee a successful dodging attack on FR in the black-box setting. In this paper, our key insight is that the generation of adversarial examples should perform both impersonation and dodging attacks simultaneously. To this end, we propose a novel attack method termed Adversarial Pruning (Adv-Pruning), which fine-tunes existing adversarial examples to enhance their dodging capabilities while preserving their impersonation capabilities. Adv-Pruning consists of Priming, Pruning, and Restoration stages. Concretely, we propose Adversarial Priority Quantification to measure the region-wise priority of the original adversarial perturbations, identifying and releasing those with minimal impact on absolute model output variances. Then, Biased Gradient Adaptation is presented to adapt the adversarial examples to traverse the decision boundaries of both the attacker and victim by adding perturbations that favor dodging attacks to the vacated regions, preserving the prioritized features of the original perturbations while boosting dodging performance. As a result, we can maintain the impersonation capabilities of the original adversarial examples while effectively enhancing their dodging capabilities. Comprehensive experiments demonstrate the superiority of our method compared with state-of-the-art adversarial attack methods.
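The Pruning and Restoration stages described above can be illustrated with a toy sketch. This is not the authors' implementation: the priority score (absolute change in model output when a patch of the perturbation is zeroed), the patch-based regions, and the single signed-gradient update are simplifying assumptions, and `model`, `patch`, `ratio`, `eps`, and `step` are hypothetical names. The perturbation `delta` is assumed to come from an existing attack, which plays the role of the Priming stage.

```python
import numpy as np

def region_priority(model, x, delta, patch):
    """Score each patch by how much the model output changes when that patch
    of the perturbation is removed (an assumed proxy for region-wise priority)."""
    h, w = delta.shape[:2]
    base = model(x + delta)
    scores = np.zeros((h // patch, w // patch))
    for i in range(h // patch):
        for j in range(w // patch):
            trial = delta.copy()
            trial[i * patch:(i + 1) * patch, j * patch:(j + 1) * patch] = 0.0
            scores[i, j] = np.abs(model(x + trial) - base).sum()
    return scores

def prune(delta, scores, patch, ratio):
    """Zero the lowest-priority patches; return the pruned perturbation
    and a boolean mask of the vacated pixels."""
    k = int(ratio * scores.size)
    lowest = np.argsort(scores, axis=None)[:k]
    vacated = np.zeros(delta.shape[:2], dtype=bool)
    for idx in lowest:
        i, j = divmod(int(idx), scores.shape[1])
        vacated[i * patch:(i + 1) * patch, j * patch:(j + 1) * patch] = True
    pruned = delta.copy()
    pruned[vacated] = 0.0
    return pruned, vacated

def restore(pruned, vacated, dodge_grad, eps, step):
    """Take one signed-gradient descent step on a dodging loss, restricted to
    the vacated pixels, and clip the result to the perturbation budget eps."""
    out = pruned.copy()
    out[vacated] = np.clip(out[vacated] - step * np.sign(dodge_grad[vacated]), -eps, eps)
    return out
```

In this sketch the high-priority patches are left untouched, which mirrors the stated goal of preserving impersonation, while only the vacated pixels receive dodging-oriented updates. In practice `dodge_grad` would be the gradient of a dodging loss with respect to the input, computed on a surrogate FR model.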
