ID$^3$: Identity-Preserving-yet-Diversified Diffusion Models for Synthetic Face Recognition

Shen Li; Jianqing Xu; Jiaying Wu; Miao Xiong; Ailin Deng; Jiazhen Ji; Yuge Huang; Wenjie Feng; Shouhong Ding; Bryan Hooi

TL;DR

This paper introduces $\text{ID}^3$, a diffusion-fueled SFR model that employs an ID-preserving loss to generate diverse yet identity-consistent facial appearances, and shows that minimizing this loss is equivalent to maximizing a lower bound of an adjusted conditional log-likelihood over ID-preserving data.

Abstract

Synthetic face recognition (SFR) aims to generate synthetic face datasets that mimic the distribution of real face data, which allows for training face recognition models in a privacy-preserving manner. Despite the remarkable potential of diffusion models in image generation, current diffusion-based SFR models struggle with generalization to real-world faces. To address this limitation, we outline three key objectives for SFR: (1) promoting diversity across identities (inter-class diversity), (2) ensuring diversity within each identity by injecting various facial attributes (intra-class diversity), and (3) maintaining identity consistency within each identity group (intra-class identity preservation). Inspired by these goals, we introduce a diffusion-fueled SFR model termed $\text{ID}^3$. $\text{ID}^3$ employs an ID-preserving loss to generate diverse yet identity-consistent facial appearances. Theoretically, we show that minimizing this loss is equivalent to maximizing the lower bound of an adjusted conditional log-likelihood over ID-preserving data. This equivalence motivates an ID-preserving sampling algorithm, which operates over an adjusted gradient vector field, enabling the generation of fake face recognition datasets that approximate the distribution of real-world faces. Extensive experiments across five challenging benchmarks validate the advantages of $\text{ID}^3$.

Paper Structure (26 sections, 2 theorems, 34 equations, 8 figures, 3 tables, 3 algorithms)

Introduction
Problem Formulation
Methodology
Diffusion Models
$\text{ID}^3$ as Conditional Diffusion Models
Identity Conditioning Signal
Face Attribute Conditioning Signal
Optimization Objective
ID-Preserving Sampling
Synthetic Dataset Generation
Experiments
Dataset
Training Dataset
Implementation Details
Performance Evaluation
...and 11 more sections

Key Result

Theorem 3.1

Minimizing $\mathcal{L}$ with respect to $\boldsymbol{\theta}$ is equivalent to minimizing an upper bound of the adjusted conditional negative log-likelihood $-\log \tilde{p}(\mathbf{x} \mid \mathbf{y}, \mathbf{s})$.

Figures (8)

Figure 1: The forward pass of $\text{ID}^3$ in terms of loss computation. Given an image, its face attributes, and its face embedding, $\text{ID}^3$ obtains the image's noised version after $t$ diffusion steps and employs a denoising network to denoise it. This denoising process is conditioned on the predicted attributes and the ID embedding. Optimization proceeds by minimizing a loss function comprised of a denoising term, a one-step reconstruction term, an inner-product term, and a constant.
Figure 2: Synthetic Dataset Generation
Figure 3: Uncurated samples generated by $\text{ID}^3$ (Top) and those by IDiff-Face (Bottom).
Figure : Training Algorithm
Figure A.1: An illustration of the dataset-generating algorithm.
...and 3 more figures
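
The Figure 1 caption describes the training loss as a sum of a denoising term, a one-step reconstruction term, an inner-product term, and a constant. The sketch below is one illustrative reading of that description, not the paper's exact objective. The function name `id_preserving_loss_sketch`, the weights `w_rec` and `w_id`, the callables `denoiser` and `embed`, and the choice of a pixel-space reconstruction error with a cosine-style inner product are all assumptions made for illustration. The constant is omitted because it does not affect the gradient.

```python
import numpy as np


def id_preserving_loss_sketch(x0, eps, denoiser, alpha_bar_t, embed, id_embedding,
                              w_rec=1.0, w_id=1.0):
    """Hypothetical three-term loss: denoising + reconstruction - ID inner product.

    x0           : clean image as a flat array
    eps          : Gaussian noise with the same shape as x0
    denoiser     : callable x_t -> predicted noise (conditioning omitted here)
    alpha_bar_t  : cumulative noise-schedule coefficient in (0, 1)
    embed        : callable image -> face embedding (stand-in for a face encoder)
    id_embedding : target identity embedding
    """
    # Standard forward diffusion: noise the clean image to step t.
    x_t = np.sqrt(alpha_bar_t) * x0 + np.sqrt(1.0 - alpha_bar_t) * eps

    # Denoising term: error between the true and the predicted noise.
    eps_hat = denoiser(x_t)
    denoise_term = np.mean((eps - eps_hat) ** 2)

    # One-step reconstruction: invert the forward step with the predicted noise.
    x0_hat = (x_t - np.sqrt(1.0 - alpha_bar_t) * eps_hat) / np.sqrt(alpha_bar_t)
    recon_term = np.mean((x0 - x0_hat) ** 2)

    # Inner-product term: alignment between the embedding of the reconstruction
    # and the identity embedding (both L2-normalized, an assumption).
    e = embed(x0_hat)
    e = e / np.linalg.norm(e)
    y = id_embedding / np.linalg.norm(id_embedding)
    inner_term = float(e @ y)

    # Higher alignment should lower the loss, hence the minus sign.
    return denoise_term + w_rec * recon_term - w_id * inner_term
```

In this sketch, a denoiser that predicts the noise exactly drives the first two terms to zero, so the loss is then governed entirely by how well the reconstructed face aligns with the identity embedding.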

Theorems & Definitions (5)

Theorem 3.1
proof
Lemma A.1
proof
proof
