DeepFeatureX Net: Deep Features eXtractors based Network for discriminating synthetic from real images

Orazio Pontorno; Luca Guarnera; Sebastiano Battiato

TL;DR

The paper tackles the problem of detecting AI-generated images and the generalization gap when confronted with unseen generation architectures. It introduces a three-branch architecture where each Base Model specializes in discriminative features for DM-generated, GAN-generated, or real images, with their outputs concatenated into a final classifier. Training uses highly unbalanced binary tasks to induce architecture-focused feature extraction, and the final model demonstrates robustness to JPEG compression while achieving superior generalization compared to state-of-the-art methods. The dataset comprises 72k images from real, GAN, and DM sources, with a structured train/validation/testing split, and the work provides open-source code and data. Overall, the approach advances practical deepfake detection by enhancing cross-architecture generalization and resilience to common image perturbations.

Abstract

Deepfakes, synthetic images generated by deep learning algorithms, represent one of the biggest challenges in the field of Digital Forensics. The scientific community is working to develop approaches that can discriminate the origin of digital images (real or AI-generated). However, these methodologies face the challenge of generalization, that is, the ability to discern the nature of an image even when it is generated by an architecture not seen during training, a setting in which performance usually drops. In this context, we propose a novel approach based on three blocks called Base Models. Each Base Model is trained on a deliberately unbalanced dataset so that it extracts the discriminative features of one specific image class (Diffusion Model-generated, GAN-generated, or real). The features extracted by the three blocks are then concatenated and processed to discriminate the origin of the input image. Experimental results show that this approach is robust to JPEG compression and also outperforms state-of-the-art methods in several generalization tests. Code, models and dataset are available at https://github.com/opontorno/block-based_deepfake-detection.
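The deliberately unbalanced training sets mentioned in the abstract can be illustrated with a minimal sketch. It assumes each Base Model sees a binary task (its own class versus the rest) in which all samples of the target class are kept and the other classes are subsampled. The function name, the `other_keep_ratio` value, and the choice of which side is the majority are hypothetical and are not taken from the paper.

```python
import random
from collections import Counter

CLASSES = ("DM", "GAN", "REAL")


def make_unbalanced_binary_subset(samples, target, other_keep_ratio=0.2, seed=0):
    """Build a binary, unbalanced subset for one Base Model.

    samples: list of (item, class_name) pairs.
    Returns (item, label) pairs with label 1 for `target` and 0 otherwise.
    `other_keep_ratio` is an illustrative assumption, not a value from the paper.
    """
    if target not in CLASSES:
        raise ValueError(f"unknown class: {target}")
    rng = random.Random(seed)
    positives = [(x, 1) for x, y in samples if y == target]
    others = [(x, 0) for x, y in samples if y != target]
    # Keep only a fraction of the non-target samples to unbalance the task.
    negatives = rng.sample(others, int(len(others) * other_keep_ratio))
    subset = positives + negatives
    rng.shuffle(subset)
    return subset


# Toy dataset: 100 placeholder file names per class.
samples = [(f"img_{c}_{i}.png", c) for c in CLASSES for i in range(100)]
subsets = {c: make_unbalanced_binary_subset(samples, c) for c in CLASSES}
for name, subset in subsets.items():
    print(name, Counter(label for _, label in subset))
```

Each of the three subsets would then be used to train the Base Model of the corresponding class; the actual class proportions should be taken from the paper or the released code.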

Paper Structure (11 sections, 2 equations, 2 figures, 4 tables)

Introduction
Related Works
Dataset details
Proposed Method
Training of Base Models
Overall architecture
Experimental results
Inference and robustness tests
Comparison with S.O.T.A. in generalization
Conclusion and future works
Acknowledgements

Figures (2)

Figure 1: Entire pipeline of the proposed method. (a) shows how the training dataset is divided into three unbalanced subsets, one per class (DM, GAN, real), each used to train the corresponding Base Model. (b) illustrates the architecture of the final model, which takes the three Base Models $\phi_c$ trained in the previous phase, with frozen weights, and uses them to extract the features $\phi_c(\mathcal{I})$ from a digital image $\mathcal{I}$, where $c \in \mathcal{C}=\{\mathrm{DM}, \mathrm{GAN}, \mathrm{REAL}\}$. These features are then concatenated along the channel dimension, $\phi(\mathcal{I})=\phi_{\mathrm{DM}}(\mathcal{I})\oplus\phi_{\mathrm{GAN}}(\mathcal{I})\oplus\phi_{\mathrm{REAL}}(\mathcal{I})$, and processed to solve the classification task.
Figure 2: Image variation as the JPEG compression Quality Factor (QF) decreases. Left: raw image; center: JPEG-compressed image at QF 80; right: the same image at QF 50. Image generated by StyleGAN2 (Karras et al., 2020).
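The concatenation described in the Figure 1 caption, $\phi(\mathcal{I})=\phi_{\mathrm{DM}}(\mathcal{I})\oplus\phi_{\mathrm{GAN}}(\mathcal{I})\oplus\phi_{\mathrm{REAL}}(\mathcal{I})$, can be sketched in PyTorch as follows. This is an illustration under stated assumptions, not the authors' implementation: `TinyBackbone` is a hypothetical stand-in for a trained Base Model, and the layer sizes, the classifier head, and the number of output classes are all placeholders.

```python
import torch
import torch.nn as nn


class TinyBackbone(nn.Module):
    """Hypothetical stand-in for the feature extractor of one Base Model."""

    def __init__(self, out_channels=8):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, out_channels, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(4),
        )

    def forward(self, x):
        return self.features(x)


class ConcatClassifier(nn.Module):
    """Frozen Base Models followed by a trainable head on concatenated features."""

    def __init__(self, base_models, channels_per_model=8, num_classes=3):
        super().__init__()
        self.base_models = nn.ModuleDict(base_models)
        # Freeze the Base Models: only the head receives gradient updates.
        for p in self.base_models.parameters():
            p.requires_grad = False
        total_channels = channels_per_model * len(base_models)
        # Placeholder head; the real architecture may differ.
        self.head = nn.Sequential(
            nn.Conv2d(total_channels, 16, kernel_size=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
            nn.Linear(16, num_classes),
        )

    def forward(self, image):
        with torch.no_grad():
            feats = [m(image) for m in self.base_models.values()]
        # Concatenate along the channel dimension (dim=1 for NCHW tensors).
        phi = torch.cat(feats, dim=1)
        return self.head(phi)


torch.manual_seed(0)
bases = {c: TinyBackbone() for c in ("DM", "GAN", "REAL")}
model = ConcatClassifier(bases, num_classes=3)  # num_classes is an assumption
model.eval()
batch = torch.randn(2, 3, 32, 32)  # two random RGB images, 32x32
logits = model(batch)
print(logits.shape)
```

Freezing the Base Models keeps the class-specific features learned in the first phase intact, so the second phase only has to learn how to combine them.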
