Social Media Authentication and Combating Deepfakes using Semi-fragile Invisible Image Watermarking

Aakash Varma Nadimpalli, Ajita Rattani

TL;DR

Experiments on state-of-the-art facial Deepfake datasets show that the proposed watermarking framework can embed a $64$-bit secret as an imperceptible image watermark. The secret is recovered with high bit-recovery accuracy after benign image-processing operations, but becomes non-recoverable after unseen Deepfake manipulations.

Abstract

With the significant advances in deep generative models for image and video synthesis, Deepfakes and manipulated media have raised severe societal concerns. Conventional machine learning classifiers for Deepfake detection often fail to cope with evolving Deepfake generation technology and are susceptible to adversarial attacks. Alternatively, invisible image watermarking is being researched as a proactive defense technique that allows media authentication by verifying an invisible secret message embedded in the image pixels. The handful of invisible image watermarking techniques introduced for media authentication have proven vulnerable to basic image-processing operations and watermark removal attacks. In response, we propose a semi-fragile image watermarking technique that embeds an invisible secret message into real images for media authentication. Our proposed watermarking framework is designed to be fragile to facial manipulations or tampering while being robust to benign image-processing operations and watermark removal attacks. This is facilitated by the architecture of our proposed technique, which adds a critic network and an adversarial network to the backbone encoder-decoder and discriminator networks; the critic enforces high image quality and the adversarial network enforces resilience to watermark removal efforts. Thorough experimental investigations on state-of-the-art facial Deepfake datasets demonstrate that our proposed model can embed a $64$-bit secret as an imperceptible image watermark that is recovered with high bit-recovery accuracy when benign image-processing operations are applied, while being non-recoverable when unseen Deepfake manipulations are applied. In addition, our proposed watermarking technique demonstrates high resilience to several white-box and black-box watermark removal attacks, thus obtaining state-of-the-art performance.
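The abstract's authentication criterion rests on bit-recovery accuracy: the fraction of the 64 embedded bits that the decoder returns correctly. The following is a minimal sketch of how such a check could look, not the authors' implementation. The function names, the 0.9 acceptance threshold, and the simulated decoder outputs are all assumptions made for illustration; the abstract does not state a threshold.

```python
import random


def bit_recovery_accuracy(embedded, decoded):
    """Fraction of positions where the decoded bit equals the embedded bit."""
    if len(embedded) != len(decoded):
        raise ValueError("bit sequences must have equal length")
    matches = sum(e == d for e, d in zip(embedded, decoded))
    return matches / len(embedded)


def is_authentic(embedded, decoded, threshold=0.9):
    """Accept the image if enough bits survive.

    The threshold is a hypothetical value chosen for this sketch.
    """
    return bit_recovery_accuracy(embedded, decoded) >= threshold


# A random 64-bit secret (seeded so the example is reproducible).
rng = random.Random(0)
secret = [rng.randint(0, 1) for _ in range(64)]

# Simulated benign case: the decoder gets all but two bits right.
benign = list(secret)
for i in (3, 40):
    benign[i] ^= 1

# Simulated tampered case: every other bit is wrong, i.e. chance level.
tampered = [b ^ 1 if i % 2 == 0 else b for i, b in enumerate(secret)]

print(bit_recovery_accuracy(secret, benign), is_authentic(secret, benign))
print(bit_recovery_accuracy(secret, tampered), is_authentic(secret, tampered))
```

Under these assumptions, a semi-fragile watermark is one whose accuracy stays near 1 in the benign case and drops toward 0.5 (random guessing) once the face has been manipulated.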

Paper Structure (22 sections, 11 equations, 12 figures, 15 tables)


Figures (12)

  • Figure 1: Overview of our proposed framework, which embeds a secret encrypted message into an image using an encoder-decoder style network for the purpose of media authentication. The watermark is imperceptible to the human eye and resistant to typical image alterations and watermark removal attacks, but it is vulnerable to malicious facial transformations, i.e., Deepfakes.
  • Figure 2: Overview of our proposed semi-fragile watermarking technique, based on a U-Net-based encoder-decoder architecture, for media authentication. Training the encoder $E_{\alpha}$ and decoder $D_{\beta}$ networks involves encouraging message retrieval from watermarked images that have undergone benign modifications and discouraging retrieval from watermarked images that have undergone malicious changes. The critic network $C$ produces a critic score for image quality by estimating how "real" or "authentic" the images appear. The adversary network $A_{adv}$ mimics the efforts of an intruder to remove the watermark for adversarial purposes. The imperceptibility of the watermark is enforced by the image reconstruction loss and the adversarial loss from the discriminator $A_{\gamma}$. The loss functions associated with each network in our proposed model are also shown in the figure.
  • Figure 3: Pictorial representation of the watermarked output $x_{w}$ produced when an original image $x$ is given to our proposed model for watermarking.
  • Figure 4: (a) Illustration of Gaussian blur applied to invisible watermarked images using different kernel sizes and $\sigma$ values. (b) Application of JPEG compression to invisible watermarked images at different compression rates ranging from 25 to 75 [best viewed zoomed in].
  • Figure 5: (a) Application of unseen benign transforms to invisible watermarked images. The Instagram filters Aden, Brooklyn, and Clarendon are the examples of benign transformations shown. (b) Combined application of the same unseen benign transforms to invisible watermarked images. In the figure, (A+B) denotes the combined application of the Aden and Brooklyn filters, (B+C) denotes the combined application of the Brooklyn and Clarendon filters, and (A+B+C) denotes the combined application of the Aden, Brooklyn, and Clarendon filters.
  • ...and 7 more figures
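
The Figure 2 caption describes a training signal that rewards message retrieval after benign modifications and penalizes it after malicious ones. One way to express that idea is as a single scalar objective, sketched below. This is an illustrative assumption, not the paper's loss: the L1 message distance, the subtraction form, the `fragility_weight` parameter, and all function names are hypothetical, and the actual loss functions are the ones shown in the paper's figure.

```python
def l1_message_distance(decoded_probs, secret_bits):
    """Mean absolute difference between decoded bit probabilities and the secret."""
    if len(decoded_probs) != len(secret_bits):
        raise ValueError("inputs must have equal length")
    total = sum(abs(p - b) for p, b in zip(decoded_probs, secret_bits))
    return total / len(secret_bits)


def semi_fragile_message_loss(benign_probs, malicious_probs, secret_bits,
                              fragility_weight=1.0):
    """Hypothetical semi-fragile objective (lower is better).

    benign_probs:    decoder output for a benignly transformed watermarked image
    malicious_probs: decoder output for a maliciously manipulated watermarked image
    """
    # Encourage retrieval after benign transforms: small distance is good.
    recover_term = l1_message_distance(benign_probs, secret_bits)
    # Discourage retrieval after malicious transforms: large distance is good.
    destroy_term = l1_message_distance(malicious_probs, secret_bits)
    return recover_term - fragility_weight * destroy_term


secret_bits = [1, 0, 1, 1]
accurate = [0.9, 0.1, 0.8, 0.95]   # decoder output close to the secret
chance = [0.5, 0.5, 0.5, 0.5]      # decoder output carrying no information

# Desired behaviour: accurate after benign edits, uninformative after tampering.
desired = semi_fragile_message_loss(accurate, chance, secret_bits)
# Undesired behaviour: the reverse.
undesired = semi_fragile_message_loss(chance, accurate, secret_bits)
print(desired < undesired)
```

In this sketch the desired decoder behaviour yields the lower loss, so minimizing it would push the encoder-decoder pair toward robustness under benign transforms and fragility under manipulation. In the full framework described in the caption, such a message term would be combined with the reconstruction, critic, discriminator, and adversary losses.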