ID-Guard: A Universal Framework for Combating Facial Manipulation via Breaking Identification
Zuomin Qu, Wei Lu, Xiangyang Luo, Qian Wang, Xiaochun Cao
TL;DR
ID-Guard is a universal, proactive defense against facial manipulation that generates a cross-model transferable perturbation in a single forward pass. It combines an Identity Destruction Module, which erases identifiable facial features, with a dynamic multi-task training scheme (including MGDA and KPI-based strategies) and a gradient prior perturbation that stabilizes optimization. The approach disrupts multiple open-source manipulation models and degrades identity information, as measured by $L_{2}^{face}$ and ID similarity, while keeping the perturbation inconspicuous under the bound $\|\delta\|_\infty \le \epsilon$. It can also serve as an adversarial training module and remains robust under lossy operations and in gray-box settings, which makes it useful for mitigating face stigmatization and for hardening downstream systems. ID-Guard thus offers a plug-and-play framework for proactive defense against facial manipulation.
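To make the bound $\|\delta\|_\infty \le \epsilon$ concrete, the following is a minimal sketch of how a raw perturbation could be projected onto the $L_\infty$ ball before it is added to an image. The function name `apply_bounded_perturbation`, the assumption that pixel values lie in $[0, 1]$, and the value `epsilon = 0.05` are illustrative choices, not details taken from the paper.

```python
import numpy as np

def apply_bounded_perturbation(image, delta, epsilon):
    """Clip delta to [-epsilon, epsilon] element-wise, add it to the image,
    and keep the result in the valid pixel range [0, 1] (assumed range)."""
    delta = np.clip(delta, -epsilon, epsilon)      # enforce ||delta||_inf <= epsilon
    return np.clip(image + delta, 0.0, 1.0)        # keep pixels valid

# Hypothetical data: a random "image" and an unbounded raw perturbation.
rng = np.random.default_rng(0)
image = rng.random((3, 8, 8))
raw_delta = rng.normal(scale=0.2, size=image.shape)
epsilon = 0.05  # illustrative budget, not the paper's setting

protected = apply_bounded_perturbation(image, raw_delta, epsilon)
max_change = float(np.max(np.abs(protected - image)))
```

Because the input image already lies in $[0, 1]$, the final clipping can only shrink the change at each pixel, so `max_change` never exceeds `epsilon`.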
Abstract
The misuse of deep learning-based facial manipulation poses a significant threat to civil rights. To prevent this fraud at its source, proactive defense has been proposed to disrupt the manipulation process by adding invisible adversarial perturbations to images, making the forged output unconvincing to observers. However, non-specific disruption of the output may leave identifiable facial features intact, potentially resulting in the stigmatization of the individual. This paper proposes a universal framework for combating facial manipulation, termed ID-Guard. Specifically, the framework uses a single forward pass of an encoder-decoder network to produce a cross-model transferable adversarial perturbation. A novel Identity Destruction Module (IDM) is introduced to degrade identifiable features in forged faces. We optimize the perturbation generation by framing the disruption of different facial manipulations as a multi-task learning problem, and a dynamic weight strategy is devised to enhance cross-model performance. Experimental results demonstrate that the proposed ID-Guard exhibits strong efficacy in defending against various facial manipulation models, effectively degrading identifiable regions in manipulated images. It also enables disrupted images to evade facial inpainting and image recognition systems. Additionally, ID-Guard can seamlessly function as a plug-and-play component, integrating with other tasks such as adversarial training.
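As an illustration of dynamic weighting in a multi-task setting, the sketch below computes the closed-form two-task min-norm weights used by MGDA-style methods. Each "task" stands for disrupting one manipulation model, and `g1`, `g2` are hypothetical flattened gradients of the two task losses. This is a generic sketch under those assumptions, not the paper's exact weighting strategy.

```python
import numpy as np

def min_norm_weights(g1, g2):
    """Return (w1, w2) with w1 + w2 = 1 and w1, w2 >= 0 that minimize
    ||w1 * g1 + w2 * g2||, i.e. the two-task min-norm solution."""
    diff = g1 - g2
    denom = float(diff @ diff)
    if denom == 0.0:
        # Identical gradients: any convex combination is equivalent.
        return 0.5, 0.5
    w1 = float((g2 - g1) @ g2) / denom
    w1 = min(1.0, max(0.0, w1))  # project onto the simplex
    return w1, 1.0 - w1

# Hypothetical gradients for two manipulation models.
g_a = np.array([1.0, 0.0])
g_b = np.array([0.0, 1.0])
w_a, w_b = min_norm_weights(g_a, g_b)
combined = w_a * g_a + w_b * g_b

# When one gradient dominates in norm, the weight shifts to the smaller one.
w_small, w_large = min_norm_weights(np.array([1.0, 0.0]), np.array([3.0, 0.0]))
```

The combined direction has a non-negative inner product with both task gradients, so a small step along it does not increase either task's loss to first order. That is the property that makes this kind of weighting attractive when the tasks compete.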
