Mitigating the Impact of Attribute Editing on Face Recognition

Sudipta Banerjee, Sai Pranaswi Mullangi, Shruti Wagle, Chinmay Hegde, Nasir Memon

TL;DR

This work addresses the vulnerability of automated face recognition to semantic facial attribute edits produced by diffusion-based generative models. It introduces a two-tier mitigation framework: global editing using DreamBooth-based regularization with a contrastive loss and a local editing approach using ControlNet-guided inpainting with depth and segmentation conditioning, designed to preserve identity while enabling diverse attribute changes. Across CelebA, CelebAMaskHQ, and LFW, the proposed methods (DB-prop for global edits and CN-IP for local edits) substantially improve biometric fidelity compared to strong baselines, with LLaVA-based automated attribute validation supporting editing accuracy. The findings highlight both the potential risks of attribute editing for evasion and the effectiveness of two complementary strategies for maintaining identity, offering practical guidance for safer deployment of editing tools in identity-sensitive contexts.

Abstract

Through a large-scale study over diverse face images, we show that facial attribute editing using modern generative AI models can severely degrade automated face recognition systems. This degradation persists even with identity-preserving generative models. To mitigate this issue, we propose two novel techniques for local and global attribute editing. We empirically ablate twenty-six facial semantic, demographic and expression-based attributes that have been edited using state-of-the-art generative models, and evaluate them using ArcFace and AdaFace matchers on CelebA, CelebAMaskHQ and LFW datasets. Finally, we use LLaVA, an emerging visual question-answering framework for attribute prediction to validate our editing techniques. Our methods outperform the current state-of-the-art at facial editing (BLIP, InstantID) while improving identity retention by a significant extent.
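As a rough illustration of how identity retention between an original and an attribute-edited image might be checked, the sketch below compares two face embeddings by cosine similarity. It assumes the embeddings have already been extracted by a matcher such as ArcFace or AdaFace. The function names and the threshold of 0.5 are hypothetical placeholders, not values from the paper.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length embedding vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("embeddings must be non-zero")
    return dot / (norm_a * norm_b)


def is_same_identity(emb_original, emb_edited, threshold=0.5):
    """Declare a match when similarity reaches the threshold.

    The threshold is a hypothetical placeholder; in practice it would be
    calibrated per matcher at a chosen false match rate.
    """
    return cosine_similarity(emb_original, emb_edited) >= threshold
```

An edit that preserves identity should keep the similarity between the original and edited embeddings above the matcher's operating threshold, whereas an edit that degrades recognition pushes it below.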

Paper Structure (12 sections, 4 equations, 13 figures, 7 tables)

Figures (13)

  • Figure 1: (a) The global editing framework uses a pre-trained latent diffusion model (DreamBooth) and fine-tunes it with a few samples of a target individual and a regularization set containing image-caption pairs belonging to different attributes. The model uses contrastive learning to generate attribute-edited images while maintaining biometric fidelity. This method operates in txt2img mode. (b) The local editing framework uses a pre-trained latent diffusion model (SD) along with ControlNet (CN) for inpainting guided by segmentation masks. CN uses a conditional input, such as a depth map or edge map, for detail preservation and a mask for attribute editing. Note that white pixels in the mask indicate the local regions to be edited while preserving fidelity with the reference input image. This method operates in both txt2img and img2img modes. Both methods use a text prompt to edit the input with the desired attribute.
  • Figure 2: Outputs on LFW dataset using the baseline method DB-base. 'No attrib': No attribute editing; 'Dub chin': Double chin; 'Eyebrows': Bushy eyebrows and 'Mo open': Mouth slightly open.
  • Figure 3: Outputs on LFW dataset using our global attribute editing method DB-prop. Note that the Bald, Angry, Male and Female edited images maintain the original identity, in contrast to DB-base in Fig. 2.
  • Figure 4: Fine-grained local attribute editing by our CN-IP method. The first image corresponds to editing only the lower lip. The second image edits both lips to an orange color. We achieve this by adding the lower and upper lip masks and then performing the editing operation. The third image corresponds to adding a necktie using the neck mask, and the fourth image corresponds to editing the eye color to blue by adding both left and right eye masks.
  • Figure 5: Baselines (InstantID and BLIPDiffusion) vs. our proposed method (CN-IP). In (a), we show examples of generated images for unseen attributes. Note that BLIPDiffusion requires a reference image, so in the absence of a reference image of a person with blue hair, it defaults to reconstructing the original image. InstantID is unable to successfully perform editing for rare unseen attributes. In (b), we show examples pertaining to three different types of attributes: hat, young and male.
  • ...and 8 more figures
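
The mask combination described in the Figure 4 caption (adding the upper and lower lip masks before editing) can be sketched as follows. This is a minimal illustration, not the authors' implementation. It assumes binary `uint8` masks in which white (255) marks pixels to edit, as stated in the Figure 1 caption. The function name `combine_masks` and the toy 4x4 masks are hypothetical.

```python
import numpy as np


def combine_masks(*masks):
    """Union of binary masks (uint8, values 0 or 255).

    White (255) pixels mark the regions to be edited by inpainting.
    """
    if not masks:
        raise ValueError("at least one mask is required")
    combined = np.zeros_like(masks[0], dtype=np.uint8)
    for mask in masks:
        if mask.shape != combined.shape:
            raise ValueError("all masks must share the same shape")
        # For binary masks, a pixel-wise maximum equals adding the masks
        # and clipping at 255, and it cannot overflow uint8.
        combined = np.maximum(combined, mask.astype(np.uint8))
    return combined


# Hypothetical toy masks standing in for segmentation outputs.
upper_lip = np.zeros((4, 4), dtype=np.uint8)
upper_lip[1, 1:3] = 255
lower_lip = np.zeros((4, 4), dtype=np.uint8)
lower_lip[2, 1:3] = 255

lips = combine_masks(upper_lip, lower_lip)
```

The combined mask would then be passed as the inpainting mask, so that only the union of the selected regions is regenerated while the rest of the face is left unchanged. Using a maximum rather than a raw sum keeps overlapping regions at 255 instead of wrapping around.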