IdentityGuard: Context-Aware Restriction and Provenance for Personalized Synthesis

Lingyun Zhang, Yu Xie, Ping Chen

Abstract

The nature of personalized text-to-image models poses a unique safety challenge that generic context-blind methods are ill-equipped to handle. Such global filters create a dilemma: to prevent misuse, they are forced to damage the model's broader utility by erasing concepts entirely, causing unacceptable collateral damage. Our work presents a more precisely targeted approach, built on the principle that security should be as context-aware as the threat itself, intrinsically bound to the personalized concept. We present IDENTITYGUARD, which realizes this principle through a conditional restriction that blocks harmful content only when combined with the personalized identity, and a concept-specific watermark for precise traceability. Experiments show our approach prevents misuse while preserving the model's utility and enabling robust traceability. By moving beyond blunt, global filters, our work demonstrates a more effective and responsible path toward AI safety.

Paper Structure (10 sections, 3 equations, 3 figures, 2 tables)

Figures (3)

  • Figure 1: The motivation for IDENTITYGUARD. Generic, context-blind security methods (middle column) force an unacceptable trade-off: they either destroy the user's identity when blocking a threat, or destroy the model's utility on benign prompts. Our method, by binding safeguards directly to the personalized concept, is the only one to succeed in both scenarios. It defends against misuse while preserving the model's performance on general prompts.
  • Figure 2: The IDENTITYGUARD fine-tuning framework. Our method trains a single Denoising U-Net using two conditional paths. (Top Path) For benign personalized prompts, our Concept-Bound Provenance is activated, embedding a watermark. (Bottom Path) For malicious prompts, our novel Semantic Redirection Loss is activated, redirecting the output towards a safe, identity-preserving result by aligning the noise predictions of the malicious and benign prompts. Here, $c^*$ is the embedding for the personalized concept, $c_p$ is for the prohibited concept, and $\text{sg}(\cdot)$ is the stop-gradient operator.
  • Figure 3: Qualitative analysis of our context-aware security. The key failure of generic methods like Erasing Concept (ESD) is revealed in the bottom rows: to provide protection in the personalized context (top rows), they are forced to inflict catastrophic collateral damage, globally erasing the concept of "fire" and failing to generate a simple campfire. In contrast, IDENTITYGUARD's safeguard is intelligently bound only to the personalized identity, allowing it to preserve the concept for general use.
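
The Figure 2 caption describes the Semantic Redirection Loss only in words. One plausible way to write it, as a sketch rather than the paper's actual equation, is shown here. It assumes a standard noise-prediction diffusion objective and assumes that the stop-gradient is applied to the benign branch so that it serves as a fixed target. The symbols $\epsilon_\theta$ (the denoising U-Net), $z_t$ (the noisy latent at timestep $t$), $\oplus$ (combining the two concepts in one prompt), and the name $\mathcal{L}_{\text{redirect}}$ are assumptions introduced for illustration; only $c^*$, $c_p$, and $\text{sg}(\cdot)$ come from the caption.

```latex
% Hypothetical form of the redirection objective (not taken from the paper):
% the prediction for the malicious prompt (identity + prohibited concept)
% is pulled toward the frozen prediction for the benign identity prompt.
\mathcal{L}_{\text{redirect}}
  = \mathbb{E}_{z_t,\, t}
    \left[
      \left\|
        \epsilon_\theta\!\left(z_t, t, c^* \oplus c_p\right)
        - \text{sg}\!\left(\epsilon_\theta\!\left(z_t, t, c^*\right)\right)
      \right\|_2^2
    \right]
```

Under this reading, gradients flow only through the malicious-prompt branch, so the model learns to ignore $c_p$ when it appears alongside $c^*$, while prompts containing $c_p$ alone are not directly penalized.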