Towards Generalized Proactive Defense against Face Swapping with Contour-Hybrid Watermark

Ruiyang Xia, Dawei Zhou, Decheng Liu, Lin Yuan, Jie Li, Nannan Wang, Xinbo Gao

TL;DR

This work generalizes face swapping detection without requiring any swapping techniques during training or the storage of large-scale messages in advance. It embeds contour texture and face identity information in the watermark to achieve progressive image determination.

Abstract

Face swapping, recognized as a privacy and security concern, has prompted considerable defensive research. With the advancements in AI-generated content, the discrepancies between real and swapped faces have become subtle. Considering the difficulty of detecting forged traces, we shift the focus to the purpose of face swapping and proactively embed elaborate watermarks against unknown face swapping techniques. Given that the constant purpose is to swap the original face identity while preserving the background, we concentrate on the regions surrounding the face to ensure robust watermark generation, while embedding the contour texture and face identity information to achieve progressive image determination. The watermark is located in the facial contour and contains hybrid messages, and is therefore dubbed the contour-hybrid watermark (CMark). Our approach generalizes face swapping detection without requiring any swapping techniques during training or the storage of large-scale messages in advance. Experiments conducted across 8 face swapping techniques demonstrate the superiority of our approach compared with state-of-the-art passive and proactive detectors while achieving a favorable balance between image quality and watermark robustness.

Paper Structure

This paper contains 23 sections, 13 equations, 11 figures, 12 tables, 1 algorithm.

Figures (11)

  • Figure 1: Comparison of proactive detection. Previous methods embed a watermark with a single-type message into the entire image. In contrast, we embed the watermark into the facial contour to ensure robustness, while integrating identity and contour texture (Tex.) into the message to achieve progressive determination.
  • Figure 2: (a) Qualitative and quantitative face swapping comparisons. 'M-FS', 'R-FS', 'S-FS', and 'D-FS' denote the manual method (FaceSwap), reconstruction-based methods (MobileFS, Faceshifter, and SimSwap), StyleGAN-based methods (E4S and MegaFS), and diffusion-based methods (DiffSwap and DiffFace), respectively. FID is computed by averaging the features from different layers of the Inception network. (b) Qualitative and quantitative contour landmark comparisons under FaceSwap at 256$\times$256 resolution.
  • Figure 3: Pipeline of our proposed CMark model. Firstly, the client generates message $\textbf{V}^r$ by integrating features from the contour texture extractor (CT-E) and face identity extractor (ID-E). The message is then embedded into the facial contour through watermark encoder $\mathcal{E}$ after scaling by a strength factor $\alpha$. Secondly, for the watermarked image $\hat{\textbf{X}}^{r}$ subjected to random manipulations, the robust watermark is decoded by inputting the contour region of the manipulated $\hat{\textbf{X}}^{r}$ into the watermark decoder $\mathcal{D}$. Finally, after sending the encrypted messages to the platform for decryption, the decoded message $\hat{\textbf{V}}$ is verified against the reference message $\bar{\textbf{V}}$ to determine the watermark existence and the image authenticity. 'Gen.' denotes 'Generation'.
  • Figure 4: Illustration of the facial contour mask generation. $\text{MD}(\cdot)$ denotes the morphological dilation function. This process confines the watermark to sub-regions of the background, rendering it unaffected by internal facial manipulations.
  • Figure 5: Illustration of the proposed denoiser modules, $\mathcal{P}_1$ and $\mathcal{P}_2$, during the training phase.
  • ...and 6 more figures
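
The contour mask idea in the Figure 4 caption can be illustrated with a minimal Python sketch. It dilates a binary face mask and removes the original face pixels, leaving a ring in the background next to the face. The function names (`dilate`, `contour_mask`), the square structuring element, and the `radius` parameter are assumptions made for illustration; the paper's actual mask construction may differ.

```python
from typing import List

Mask = List[List[int]]  # 2D binary mask, 1 = selected pixel


def dilate(mask: Mask, radius: int) -> Mask:
    """Binary morphological dilation with a (2*radius+1) square element."""
    h, w = len(mask), len(mask[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if mask[y][x]:
                # Mark every pixel inside the square window, clipped to the image.
                for yy in range(max(0, y - radius), min(h, y + radius + 1)):
                    for xx in range(max(0, x - radius), min(w, x + radius + 1)):
                        out[yy][xx] = 1
    return out


def contour_mask(face_mask: Mask, radius: int) -> Mask:
    """Hypothetical contour ring: MD(face) with the face interior removed."""
    dilated = dilate(face_mask, radius)
    return [
        [d & (1 - f) for d, f in zip(d_row, f_row)]
        for d_row, f_row in zip(dilated, face_mask)
    ]


# Toy example: a 3x3 "face" centered in a 7x7 image.
face = [[1 if 2 <= y <= 4 and 2 <= x <= 4 else 0 for x in range(7)]
        for y in range(7)]
ring = contour_mask(face, radius=1)
for row in ring:
    print("".join("#" if v else "." for v in row))
```

Because the ring excludes every face pixel, a watermark confined to it would not be overwritten by a manipulation that only alters the face interior, which is the property the caption describes. On real images, a library routine such as OpenCV's `cv2.dilate` would normally replace the hand-written loop.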