Anonymization Prompt Learning for Facial Privacy-Preserving Text-to-Image Generation
Liang Shi, Jie Zhang, Shiguang Shan
TL;DR
This paper addresses identity leakage in text-to-image diffusion models by introducing Anonymization Prompt Learning (APL), a plug-and-play approach that prepends a learnable prompt prefix to the prompts fed to diffusion-based generators. The prefix is trained to steer identity-specific prompts toward anonymized faces while preserving non-identity content; a dual data strategy and mixed losses keep the prefix inactive for non-identity prompts. Across multiple models, including Stable Diffusion variants and Realistic Vision, APL substantially reduces identity recognition accuracy on generated faces with minimal degradation in image quality and text fidelity. It generalizes to identities unseen during training and to identities added post hoc through personalization. Because the core model parameters are left unmodified, service platforms can deploy the prefix as a practical, transferable safeguard against deepfake risks.
Abstract
Text-to-image diffusion models, such as Stable Diffusion, generate highly realistic images from text descriptions. However, the ability to generate certain content at such high quality raises concerns. A prominent issue is the accurate depiction of identifiable faces, which can enable malicious deepfake generation and privacy violations. In this paper, we propose Anonymization Prompt Learning (APL) to address this problem. Specifically, we train a learnable prompt prefix for text-to-image diffusion models that forces the model to generate anonymized facial identities, even when it is prompted to produce images of specific individuals. Extensive quantitative and qualitative experiments demonstrate that APL anonymizes specific individuals without compromising the quality of non-identity-specific image generation. Furthermore, we show that the learned prompt prefix is plug-and-play: it can be applied effectively across different pretrained text-to-image models, providing transferable privacy and security protection against the risks of deepfakes.
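The prefix mechanism described above can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation: it assumes the prefix is a short sequence of vectors concatenated in front of the prompt's token embeddings and truncated to the encoder's context length. The names `make_prefix` and `prepend_prefix`, the toy sizes, and the point at which the prefix is inserted are all hypothetical choices made for illustration.

```python
import random

# Toy sizes chosen for readability (hypothetical, not taken from the paper).
EMBED_DIM = 4    # dimensions per token embedding
MAX_TOKENS = 8   # context length of the toy text encoder
PREFIX_LEN = 2   # number of learnable prefix vectors


def make_prefix(length, dim, seed=0):
    """Create small random prefix vectors; in training these would be the
    only values updated, while the generator itself stays frozen."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 0.02) for _ in range(dim)] for _ in range(length)]


def prepend_prefix(prefix, prompt_embeddings, max_tokens):
    """Place the prefix in front of the prompt embeddings and truncate the
    result to the context length, so the prefix is always kept."""
    combined = prefix + prompt_embeddings
    return combined[:max_tokens]


# Usage: a three-token prompt becomes a five-token conditioning sequence.
prefix = make_prefix(PREFIX_LEN, EMBED_DIM)
prompt = [[1.0] * EMBED_DIM for _ in range(3)]
conditioned = prepend_prefix(prefix, prompt, MAX_TOKENS)
print(len(conditioned))
```

Because only the prefix vectors would be trained in such a setup, the same prefix could in principle be attached to the prompts of another model that shares the text encoder, which is one way to read the plug-and-play property claimed above.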
