High-Fidelity Face Content Recovery via Tamper-Resilient Versatile Watermarking

Peipeng Yu, Jinfeng Xie, Chengfu Ou, Xiaoyu Zhou, Jianwei Fei, Yunshu Dai, Zhihua Xia, Chip Hong Chang

Abstract

The proliferation of AIGC-driven face manipulation and deepfakes poses severe threats to media provenance, integrity, and copyright protection. Prior versatile watermarking systems typically rely on embedding explicit localization payloads, which introduces a fidelity-functionality trade-off: larger localization signals degrade visual quality and often reduce decoding robustness under strong generative edits. Moreover, existing methods rarely support content recovery, limiting their forensic value when original evidence must be reconstructed. To address these challenges, we present VeriFi, a versatile watermarking framework that unifies copyright protection, pixel-level manipulation localization, and high-fidelity face content recovery. VeriFi makes three key contributions: (1) it embeds a compact semantic latent watermark that serves as a content-preserving prior, enabling faithful restoration even after severe manipulations; (2) it achieves fine-grained localization without embedding localization-specific artifacts by correlating image features with decoded provenance signals; and (3) it introduces an AIGC attack simulator that combines latent-space mixing with seamless blending to improve robustness to realistic deepfake pipelines. Extensive experiments on CelebA-HQ and FFHQ show that VeriFi consistently outperforms strong baselines in watermark robustness, localization accuracy, and recovery quality, providing a practical and verifiable defense for deepfake forensics.

Paper Structure (23 sections, 8 equations, 8 figures, 8 tables)

Figures (8)

  • Figure 1: Overview of our VeriFi framework. Our versatile watermarking architecture simultaneously achieves robust copyright protection, precise forgery localization, and high-fidelity fake face recovery, providing comprehensive forensic capabilities essential for real-world deepfake defense scenarios.
  • Figure 2: Overview of VeriFi. (1) The Unified Recovery-Copyright Watermark Embedder $Enc$ inserts an ownership code $w_{cop}$ and a compact facial signature $z_{face}$. (2) The AIGC Attack Simulator performs Latent Mixing and Poisson blending to mimic realistic deepfake attacks during training. (3) The Watermark Extractor $Dec$ recovers $\hat{w}_{cop}$ and $\hat{z}_{face}$, produces a content proxy $\hat{I}_b$, and uses $\hat{w}_{cop}$ to guide the Forgery Locator to predict the manipulation map $\hat{M}$. (4) The Watermark-Guided Deepfake Recovery Network uses a dual-stream Transformer with spatially gated cross-attention to fuse the edited image, $\hat{I}_b$, and $\hat{M}$ for selective restoration.
  • Figure 3: Overview of the proposed AIGC Attack Simulator. The simulator models AIGC edits via latent feature grafting and seamless image blending, enabling robust watermark training.
  • Figure 4: Visual comparison of tamper localization results across different methods.
  • Figure 5: Qualitative comparison of face recovery under SD Inpainting/HD-painter and Splicing.
  • ...and 3 more figures
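
The mask-guided mixing that the Figure 2 and Figure 3 captions describe can be illustrated with a minimal sketch. This is not the authors' implementation: it swaps in plain per-element alpha compositing for both the latent feature grafting and the Poisson blending, and the names `mix_with_mask` and `mask_iou` are hypothetical helpers introduced only for illustration. Grids are nested lists standing in for a feature map or image channel, and the IoU helper shows one common way a predicted manipulation map $\hat{M}$ might be scored against the simulated ground-truth mask.

```python
from typing import List

Grid = List[List[float]]  # stand-in for one feature-map or image channel


def mix_with_mask(host: Grid, donor: Grid, mask: Grid) -> Grid:
    """Blend donor content into host wherever mask is close to 1.

    Assumption: simple alpha compositing, out = m * donor + (1 - m) * host.
    The paper's simulator uses latent mixing plus Poisson blending instead.
    """
    if not (len(host) == len(donor) == len(mask)):
        raise ValueError("host, donor and mask must have the same row count")
    out: Grid = []
    for h_row, d_row, m_row in zip(host, donor, mask):
        if not (len(h_row) == len(d_row) == len(m_row)):
            raise ValueError("host, donor and mask rows must have equal length")
        out.append([m * d + (1.0 - m) * h
                    for h, d, m in zip(h_row, d_row, m_row)])
    return out


def mask_iou(pred: Grid, truth: Grid, threshold: float = 0.5) -> float:
    """Intersection-over-union of two masks after thresholding."""
    inter = 0
    union = 0
    for p_row, t_row in zip(pred, truth):
        for p, t in zip(p_row, t_row):
            p_on, t_on = p >= threshold, t >= threshold
            inter += int(p_on and t_on)
            union += int(p_on or t_on)
    # Two empty masks agree perfectly by convention.
    return inter / union if union else 1.0


if __name__ == "__main__":
    host = [[0.0, 0.0], [0.0, 0.0]]
    donor = [[1.0, 1.0], [1.0, 1.0]]
    mask = [[1.0, 0.0], [0.0, 0.0]]
    print(mix_with_mask(host, donor, mask))
    print(mask_iou([[1.0, 1.0], [0.0, 0.0]], mask))
```

In a training loop, the same mask that drives the simulated edit would double as the localization target, which is why the sketch pairs the mixing step with a mask-overlap score.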