DiffProtect: Generate Adversarial Examples with Diffusion Models for Facial Privacy Protection
Jiang Liu, Chun Pong Lau, Zhongliang Guo, Yuxiang Guo, Zhaoyang Wang, Rama Chellappa
TL;DR
DiffProtect uses a diffusion model to generate adversarial, semantically meaningful perturbations for facial privacy protection. It encodes a face image into a semantic latent code and a stochastic code, then decodes with a conditional DDIM to produce a protected face that fools face recognition systems while still looking natural. The method adds a face semantics regularization term and a faster attack variant, DiffProtect-fast, to trade off attack effectiveness against efficiency. On CelebA-HQ and FFHQ, it reports higher attack success rates than prior methods (absolute improvements of 24.5% and 25.1%, respectively) together with more natural-looking images, and it remains effective under common defenses and against a real-world FR API.
Abstract
Increasingly pervasive facial recognition (FR) systems raise serious concerns about personal privacy, especially for the billions of users who have publicly shared their photos on social media. Several attempts have been made to protect individuals from being identified by unauthorized FR systems by using adversarial attacks to generate encrypted face images. However, existing methods suffer from poor visual quality or low attack success rates, which limits their utility. Recently, diffusion models have achieved tremendous success in image generation. In this work, we ask: can diffusion models be used to generate adversarial examples that improve both visual quality and attack performance? We propose DiffProtect, which uses a diffusion autoencoder to generate semantically meaningful perturbations against FR systems. Extensive experiments demonstrate that DiffProtect produces more natural-looking encrypted images than state-of-the-art methods while achieving significantly higher attack success rates, e.g., 24.5% and 25.1% absolute improvements on the CelebA-HQ and FFHQ datasets, respectively.
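Illustrative sketch
The idea of attacking in the semantic latent space, rather than in pixel space, can be illustrated with a toy example. The code below is a minimal sketch and not the authors' implementation. A fixed linear map stands in for the conditional DDIM decoder, a second linear map stands in for the FR model, and the objective (raising cosine similarity to a target identity embedding) is an assumption made for illustration. All names (`decode`, `embed`, `attack_latent`) and the values of `eps`, `alpha`, and `steps` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
LATENT, PIXELS, EMBED = 8, 32, 16

# Toy stand-ins (assumptions): a linear "decoder" and a linear "FR model".
W = rng.normal(size=(PIXELS, LATENT)) / np.sqrt(LATENT)
A = rng.normal(size=(EMBED, PIXELS)) / np.sqrt(PIXELS)


def decode(z, x_T):
    """Toy decoder: maps semantic code z and stochastic code x_T to an image."""
    return W @ z + x_T


def embed(x):
    """Toy face-recognition embedding of an image."""
    return A @ x


def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))


def cosine_grad_wrt_z(z, x_T, e_target):
    """Analytic gradient of cos(embed(decode(z, x_T)), e_target) w.r.t. z."""
    e = embed(decode(z, x_T))
    ne, nt = np.linalg.norm(e), np.linalg.norm(e_target)
    c = e @ e_target / (ne * nt)
    grad_e = e_target / (ne * nt) - c * e / ne**2
    return W.T @ (A.T @ grad_e)


def attack_latent(z0, x_T, e_target, eps=0.5, alpha=0.05, steps=50):
    """Signed-gradient ascent on z, kept inside an L-inf ball of radius eps."""
    z, best_z = z0.copy(), z0.copy()
    best = cosine(embed(decode(z0, x_T)), e_target)
    for _ in range(steps):
        z = z + alpha * np.sign(cosine_grad_wrt_z(z, x_T, e_target))
        z = z0 + np.clip(z - z0, -eps, eps)  # project back into the ball
        sim = cosine(embed(decode(z, x_T)), e_target)
        if sim > best:  # keep the best iterate seen so far
            best, best_z = sim, z.copy()
    return best_z, best


z0 = rng.normal(size=LATENT)            # semantic code of the source face
x_T = 0.1 * rng.normal(size=PIXELS)     # stochastic code, left unchanged
e_target = embed(rng.normal(size=PIXELS))  # embedding of a target identity

init_sim = cosine(embed(decode(z0, x_T)), e_target)
z_adv, adv_sim = attack_latent(z0, x_T, e_target)
print(f"similarity to target: {init_sim:.3f} -> {adv_sim:.3f}")
```

Only the semantic code is perturbed here, and the stochastic code is held fixed, which mirrors the split between the two codes described in the TL;DR. In a real setting the decoder and FR model would be neural networks and the gradient would come from automatic differentiation.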
