
Modeling Spoof Noise by De-spoofing Diffusion and its Application in Face Anti-spoofing

Bin Zhang, Xiangyu Zhu, Xiaoyu Zhang, Zhen Lei

TL;DR

This paper tackles face anti-spoofing by introducing a de-spoofing diffusion framework that separates the spoof noise from a spoof image and recovers the underlying genuine content. Instead of relying on a guide classifier, it uses two diffusion models to translate spoof images toward the genuine domain, which allows the noise pattern to be extracted explicitly. The extracted spoof noise is fused with RGB information in a two-stream DepthNet-inspired network that is trained with depth supervision to improve liveness discrimination. On CASIA-MFSD, Replay-Attack, OULU-NPU, and SiW, the method achieves competitive intra-dataset results and superior cross-dataset generalization, suggesting that diffusion models are an effective tool for modeling spoof noise.

Abstract

Face anti-spoofing is crucial for ensuring the security and reliability of face recognition systems. Several existing face anti-spoofing methods use GAN-like networks to detect presentation attacks by estimating the noise pattern of a spoof image and recovering the corresponding genuine image. However, the limited face appearance space of GANs means that the denoised faces cannot cover the full data distribution of genuine faces, which undermines the generalization performance of such methods. In this work, we present a pioneering attempt to employ diffusion models to denoise a spoof image and restore the genuine image. The difference between these two images is treated as the spoof noise, which can serve as a discriminative cue for face anti-spoofing. We evaluate the proposed method on several intra-testing and inter-testing protocols, and the experimental results show that it achieves competitive performance in terms of both accuracy and generalization.

Paper Structure (18 sections, 5 equations, 3 figures, 5 tables)


Figures (3)

  • Figure 1: Our proposed de-spoofing diffusion method decomposes a face image into its genuine counterpart and spoof noise pattern. The noise pattern, along with the RGB input, is then utilized to train the discriminator for the accurate detection of presentation attacks.
  • Figure 2: Our method leverages two ODEs for image de-spoofing. Given a source image $x^{s}$ (e.g., a spoof image or a genuine image), the first ODE runs in the forward direction to convert it to the latent $x^{l}$, and the second, target ODE runs in the reverse direction to construct the target genuine image $x^{g}$.
  • Figure 3: De-spoofing results for genuine and spoof faces. The noise patterns extracted from spoof faces are noticeably more prominent than those extracted from genuine faces, which instead resemble the random error introduced during the inverse process.
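
The two-ODE pipeline described in Figures 2 and 3 can be illustrated with a minimal sketch. This is not the authors' implementation: the linear velocity fields below are hypothetical stand-ins for trained diffusion models, the Euler integrator is an assumed solver, and the sign convention `x_s - x_g` for the spoof noise is an assumption (the source only says the noise is the difference between the two images). The sketch shows the structure only: a forward ODE maps the source image $x^{s}$ to a latent $x^{l}$, a reverse ODE maps the latent to a genuine image $x^{g}$, and the residual serves as the noise cue.

```python
import numpy as np

def euler_ode(x, velocity, t_start, t_end, steps=100):
    """Integrate dx/dt = velocity(x, t) from t_start to t_end with Euler steps."""
    h = (t_end - t_start) / steps  # negative when integrating in reverse
    t = t_start
    for _ in range(steps):
        x = x + h * velocity(x, t)
        t += h
    return x

def make_toy_velocity(domain_mean):
    # Hypothetical stand-in for the ODE drift of a trained diffusion model.
    return lambda x, t: x - domain_mean

def despoof(x_s, source_velocity, target_velocity):
    x_l = euler_ode(x_s, source_velocity, 0.0, 1.0)  # forward ODE: image -> latent
    x_g = euler_ode(x_l, target_velocity, 1.0, 0.0)  # reverse ODE: latent -> genuine
    noise = x_s - x_g                                # assumed sign convention
    return x_g, noise

rng = np.random.default_rng(0)
image = rng.random((8, 8))               # toy "face" with values in [0, 1)
genuine_field = make_toy_velocity(0.5)   # toy genuine-domain model
spoof_field = make_toy_velocity(0.8)     # toy spoof-domain model

_, noise_from_genuine = despoof(image, genuine_field, genuine_field)
_, noise_from_spoof = despoof(image, spoof_field, genuine_field)
```

In this toy setting, passing an image through the same field in both directions leaves only a small discretization residual, while mismatched source and target fields leave a much larger one. That mirrors the qualitative contrast described in the Figure 3 caption, although real diffusion models would of course behave less simply.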