FLUID: Training-Free Face De-identification via Latent Identity Substitution

Jinhyeong Park, Shaheryar Muhammad, Seangmin Lee, Jong Taek Lee, Soon Ki Jung

Abstract

We present FLUID (Face de-identification in the Latent space via Utility-preserving Identity Displacement), a training-free framework that directly substitutes identity in the latent space of pretrained diffusion models. Inspired by substitution mechanisms in chemistry, we reinterpret identity editing as semantic displacement in the latent h-space of a pretrained unconditional diffusion model. Our framework discovers identity-editing directions through optimization guided by novel reagent losses, which supervise attribute preservation and identity suppression. We further propose both linear and geodesic (tangent-based) editing schemes to effectively navigate the latent manifold. Experimental results on CelebA-HQ and FFHQ demonstrate that FLUID achieves a superior trade-off between identity suppression and attribute preservation, outperforming state-of-the-art de-identification methods in both qualitative and quantitative evaluations.
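To make the difference between the two editing schemes concrete, here is a minimal sketch in Python. It is not the authors' implementation: the abstract does not specify the geometry of the latent manifold, so the sketch assumes, purely for illustration, that latent vectors lie on a hypersphere of radius $\|h\|$. The function names `linear_edit` and `tangent_edit` and the step size `alpha` are hypothetical.

```python
import numpy as np


def linear_edit(h, delta_h, alpha):
    """Linear edit: step from h along delta_h in a straight line."""
    return h + alpha * delta_h


def tangent_edit(h, delta_h, alpha, eps=1e-12):
    """Geodesic edit under an ASSUMED spherical latent manifold.

    delta_h is projected onto the tangent space at h, then h is moved
    along the great circle in that direction (the exponential map of a
    sphere), so the norm of h is preserved.
    """
    r = np.linalg.norm(h)
    u = h / r                                  # unit vector at the base point
    v = delta_h - np.dot(delta_h, u) * u       # tangent component of delta_h
    v_norm = np.linalg.norm(v)
    if v_norm < eps:                           # no tangent component: nothing to do
        return h.copy()
    theta = alpha * v_norm / r                 # arc angle for a step of length alpha * ||v||
    return r * (np.cos(theta) * u + np.sin(theta) * v / v_norm)
```

Under this assumption, the linear edit generally changes the norm of the latent vector, while the tangent edit keeps the edited vector on the same sphere as $h$. Whether FLUID's tangent edit takes exactly this form is not stated in the abstract.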

Paper Structure

This paper contains 50 sections, 23 equations, 12 figures, 8 tables, 1 algorithm.

Figures (12)

  • Figure 1: The concept of our approach with the metaphor of a substitution reaction. Identity editing within the latent space replaces the original identity (Id) with a de-identified one (Id*), while preserving attributes of a human face image (Hu), such as emotion (Em), gender (Ge), and pose (Po), more strictly than state-of-the-art de-identification methods.
  • Figure 2: An illustration of our framework. In the initialization procedure, $x$ is inverted into the latent space of a DM to obtain a starting point $h$, a latent vector in the $h$-space, while three auxiliary models are used to get $F_x$, $A_x$, and $M$. During optimization, a direction vector $\Delta h$ is iteratively guided by three loss functions that compare $x$ and $\hat{x}$: an identity loss $L_{ID}$, an attribute preservation loss $L_{att}$, and a face mask loss $L_{mask}$. Decoding $\hat{h}$, the combination of $h$ and $\Delta h$, by either a linear or a tangent edit results in an edited image $\hat{x}$. Note that $\Delta h$ is initialized randomly in the first optimization step.
  • Figure 3: Qualitative comparison with SOTA face de-identification methods on in-domain (CelebA-HQ) and out-of-domain (FFHQ) images. FLUID produces de-identified face images with stricter attribute preservation than other methods. While CelebA-HQ aligns with the pretrained DM’s distribution, FFHQ highlights FLUID’s generalization ability to more diverse real-world inputs.
  • Figure 4: Qualitative results of ablation study for loss function combinations. The top two rows display CelebA-HQ samples, while the bottom three show results on FFHQ.
  • Figure 5: Identity-attribute trade-off curve across increasing linear edit strengths.
  • ...and 7 more figures
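
The Figure 2 caption names three losses that guide $\Delta h$ but does not give their functional forms. The sketch below shows one way such an objective could be combined. Every form in it is a placeholder assumption, not the paper's definition: cosine similarity between identity embeddings for $L_{ID}$, mean squared error between attribute vectors for $L_{att}$, and a masked L1 penalty on pixels outside the face mask $M$ for $L_{mask}$. The function `total_loss` and the weights `w_id`, `w_att`, and `w_mask` are hypothetical.

```python
import numpy as np


def cosine_similarity(a, b):
    """Cosine similarity between two 1-D vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


def total_loss(f_x, f_hat, a_x, a_hat, x, x_hat, mask,
               w_id=1.0, w_att=1.0, w_mask=1.0):
    """Weighted sum of three ASSUMED loss terms.

    f_x, f_hat : identity embeddings of the original and edited image
    a_x, a_hat : attribute vectors of the original and edited image
    x, x_hat   : original and edited images as arrays of equal shape
    mask       : array of the same shape as x, 1 inside the face, 0 outside
    """
    # Assumed identity term: minimizing similarity pushes identities apart.
    l_id = cosine_similarity(f_x, f_hat)
    # Assumed attribute term: keep attribute predictions close.
    l_att = float(np.mean((a_x - a_hat) ** 2))
    # Assumed mask term: penalize changes outside the face region only.
    l_mask = float(np.mean(np.abs((1.0 - mask) * (x - x_hat))))
    total = w_id * l_id + w_att * l_att + w_mask * l_mask
    return total, {"id": l_id, "att": l_att, "mask": l_mask}
```

In an optimization loop of the kind the caption describes, a scalar like `total` would be minimized with respect to $\Delta h$ after decoding $\hat{h}$ into $\hat{x}$. The decoder and the auxiliary models are omitted here because the caption does not name them.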