Generalized Deepfake Attribution

Sowdagar Mahammad Shahid; Sudev Kumar Padhi; Umesh Kashyap; Sk. Subidh Ali

TL;DR

The paper tackles the challenge of attributing GAN-generated images to their underlying architectures when generators are retrained with different seeds or fine-tuned. It introduces Generalized Deepfake Attribution Network (GDA-Net), comprising a Feature Extraction Network (FEN) and a multi-class classifier, with Vanilla-FEN and Denoiser-FEN variants that leverage supervised contrastive learning to extract architecture-dependent fingerprints and reduce content dependency via residuals from a denoising autoencoder. The approach demonstrates robust cross-seed and fine-tuning attribution across DCGAN, WGAN, ProGAN, and SNGAN, outperforming prior methods in generalization. The work has practical impact for forensic analysis and IP protection of GAN architectures, with code released to validate the results.
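The residual idea mentioned above can be sketched as follows. This is a minimal illustration only: the paper uses a trained denoising autoencoder, whereas the `box_blur` function here is a hypothetical stand-in (a simple mean filter), and the image size and fingerprint pattern are invented for the example. The point is that subtracting a denoised image from the original removes smooth content and leaves the high-frequency component in which a generator fingerprint would sit.

```python
import numpy as np

def box_blur(img, k=3):
    """Stand-in denoiser: k x k mean filter with edge padding.
    Assumption for illustration; NOT the paper's denoising autoencoder."""
    img = np.asarray(img, dtype=np.float64)
    pad = k // 2
    padded = np.pad(img, pad, mode="edge")
    out = np.zeros(img.shape, dtype=np.float64)
    h, w = img.shape
    # Sum the k*k shifted views of the padded image, then average.
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + h, dx:dx + w]
    return out / (k * k)

def residual(img, denoiser=box_blur):
    """Residual = image minus its denoised version."""
    img = np.asarray(img, dtype=np.float64)
    return img - denoiser(img)

# Toy image: flat "content" plus a faint checkerboard "fingerprint".
content = np.full((8, 8), 0.5)
fingerprint = 0.05 * ((np.indices((8, 8)).sum(axis=0) % 2) * 2 - 1)
image = content + fingerprint

r = residual(image)
```

Because the stand-in filter is linear, the flat content cancels exactly and `r` depends only on the fingerprint term. A learned denoiser would not cancel content this cleanly, so the residual would reduce content dependency rather than eliminate it.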

Abstract

The introduction of Generative Adversarial Networks (GANs) changed the landscape of fake media creation. Fake media creation has been on the rise with rapid advances in generation technology, leading to new challenges in detecting fake media. A fundamental characteristic of GANs is their sensitivity to parameter initialization, which is governed by the random seed. Each distinct seed used during training yields a unique model instance, resulting in divergent image outputs despite the same architecture. This means that a single GAN architecture can produce countless GAN model variants depending on the seed used. Existing methods for attributing deepfakes work well only if they have seen the specific GAN model during training. If the GAN architecture is retrained with a different seed, these methods struggle to attribute the fakes. This seed dependency makes it difficult to attribute deepfakes with existing methods. We propose a generalized deepfake attribution network (GDA-Net) to attribute fake images to their respective GAN architectures, even if they are generated from a retrained version of the GAN architecture with a different seed (cross-seed) or from a fine-tuned version of an existing GAN model. Extensive experiments on cross-seed and fine-tuned data of GAN models show that our method is highly effective compared to existing methods. We provide the source code to validate our results.
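The seed dependency described in the abstract can be illustrated with a toy example. The sketch below is an assumption-laden simplification: `init_generator` and `generate` are hypothetical names, the "generator" is a single linear layer with a tanh output, and no training is performed. It shows only that one architecture, initialized with two different seeds, gives two distinct model instances that map the same latent vectors to different outputs.

```python
import numpy as np

def init_generator(seed, latent_dim=8, out_dim=16):
    """Initialize a toy one-layer 'generator'; the seed fixes the initial weights."""
    rng = np.random.default_rng(seed)
    return {
        "W": rng.normal(0.0, 0.02, size=(latent_dim, out_dim)),
        "b": np.zeros(out_dim),
    }

def generate(params, z):
    """Map latent vectors z to outputs in (-1, 1)."""
    return np.tanh(z @ params["W"] + params["b"])

# The same latent batch is fed to every model instance.
z = np.random.default_rng(0).normal(size=(4, 8))

g_seed1 = init_generator(seed=1)
g_seed2 = init_generator(seed=2)        # same architecture, different seed
g_seed1_again = init_generator(seed=1)  # same architecture, same seed

out1 = generate(g_seed1, z)
out2 = generate(g_seed2, z)
out1_again = generate(g_seed1_again, z)
```

Here `out1` and `out1_again` are identical, while `out1` and `out2` differ even though both models share one architecture. An attribution method keyed to a specific model instance is exposed to exactly this kind of variation.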

Paper Structure (21 sections, 1 equation, 6 figures, 4 tables)

Introduction
Related Work
Deepfake Attribution
Supervised Contrastive Learning
Proposed Approach
Problem definition
Overview
Feature extraction network (FEN)
Vanilla-$FEN$:
Denoiser-$FEN$:
Multi-class classification network:
Experiments
Setup
Dataset:
Model Architecture:
...and 6 more sections

Figures (6)

Figure 1: The key difference between the existing methods and the proposed method. Existing methods focus on model-level attribution, while our proposed method focuses on architecture-level attribution. Thus, the existing methods fail when attribution is performed on images generated from a retrained or fine-tuned version of a $GAN$ having the same architecture.
Figure 2: Supervised contrastive learning brings embeddings of positive samples (image augmentation pairs and image pairs from the same class) closer and pushes negative samples (image pairs from different classes) farther apart. In our case, image augmentations and images generated from the same $GAN$ architecture are brought closer, while images generated from different $GAN$ architectures are pushed farther apart. The anchor image is generated from $SNGAN$. Thus, the embeddings of the anchor's augmented image and of the image generated from the retrained $SNGAN$ (seed-$2$) are brought closer. Similarly, the embeddings of images generated from $SNGAN$ and $ProGAN$ are pushed farther apart.
Figure 3: $GDA$-$Net$ architecture using Vanilla-$FEN$ for attributing $GAN$ architectures. It consists of two networks: the Feature Extraction Network ($FEN$) and the Classification Network. $FEN$ is trained by applying supervised contrastive loss on its 128-dimensional embedding output. The intermediate layer output (2048-dimensional) of $FEN$ is used to train the classifier network for attribution.
Figure 4: $GDA$-$Net$ architecture using Denoiser-$FEN$ for attributing $GAN$ architectures. It consists of three networks: the Denoising Autoencoder ($DAE$), the Feature Extraction Network ($FEN$), and the Classification Network. $DAE$ and $FEN$ together are referred to as Denoiser-$FEN$.
Figure 5: t-SNE plot of feature embeddings of training and testing data (cross-seed) generated from the $FEN$ of Denoiser-$FEN$. Fig. A shows the feature embedding space for the training data, which contains data generated from multiple instances of $GAN$ architectures (trained with multiple seeds). Fig. B shows the feature embedding space of the testing data, which contains data generated from a completely new instance of the $GAN$ architectures (cross-seed).
...and 1 more figures
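
The pull-together, push-apart behaviour described in the Figure 2 caption can be sketched with a commonly used form of the supervised contrastive loss. The paper's own equation is not reproduced on this page, so the formulation, the `supcon_loss` name, and the temperature value below are assumptions, and the four toy embeddings are invented. Labels stand for GAN architectures, so images from any seed of the same architecture (and their augmentations) count as positives.

```python
import numpy as np

def supcon_loss(embeddings, labels, temperature=0.1):
    """Supervised contrastive loss in a commonly used form (assumed here,
    not taken from the paper). Same-label samples act as positives."""
    emb = np.asarray(embeddings, dtype=np.float64)
    labels = np.asarray(labels)
    z = emb / np.linalg.norm(emb, axis=1, keepdims=True)  # unit-norm embeddings
    sim = (z @ z.T) / temperature
    n = len(labels)
    self_mask = np.eye(n, dtype=bool)
    sim = np.where(self_mask, -np.inf, sim)  # exclude self-comparisons
    # Numerically stable log-softmax over all other samples in the batch.
    sim_max = sim.max(axis=1, keepdims=True)
    log_denom = np.log(np.exp(sim - sim_max).sum(axis=1, keepdims=True))
    log_prob = sim - sim_max - log_denom
    pos_mask = (labels[:, None] == labels[None, :]) & ~self_mask
    pos_counts = pos_mask.sum(axis=1)
    valid = pos_counts > 0  # anchors with at least one positive
    pos_sum = np.where(pos_mask, log_prob, 0.0).sum(axis=1)
    return float(-(pos_sum[valid] / pos_counts[valid]).mean())

# Toy embeddings: the first two point one way, the last two another.
emb = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]])
loss_matching = supcon_loss(emb, np.array([0, 0, 1, 1]))    # labels follow the clusters
loss_mismatched = supcon_loss(emb, np.array([0, 1, 0, 1]))  # labels cut across clusters
```

When the labels agree with the geometry of the embeddings, the loss is small; when same-label samples sit far apart, it is large. Minimizing it therefore moves same-architecture embeddings together regardless of which seed produced the image.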
