DeepFeatureX Net: Deep Features eXtractors based Network for discriminating synthetic from real images
Orazio Pontorno, Luca Guarnera, Sebastiano Battiato
TL;DR
The paper tackles the detection of AI-generated images and the generalization gap that appears when a detector meets generation architectures unseen during training. It introduces a three-branch architecture in which each Base Model specializes in the discriminative features of one image class (DM-generated, GAN-generated, or real), with the branch outputs concatenated and passed to a final classifier. Training uses highly unbalanced binary tasks to induce class-focused feature extraction, and the final model is robust to JPEG compression while outperforming state-of-the-art methods in several generalization tests. The dataset comprises 72k images from real, GAN, and DM sources, with a structured train/validation/test split, and the work provides open-source code and data. Overall, the approach advances practical deepfake detection by improving cross-architecture generalization and resilience to common image perturbations.
Abstract
Deepfakes, synthetic images generated by deep learning algorithms, represent one of the biggest challenges in the field of Digital Forensics. The scientific community is working to develop approaches that can discriminate the origin of digital images (real or AI-generated). However, these methodologies face the challenge of generalization, that is, the ability to discern the nature of an image even if it was generated by an architecture not seen during training; performance usually drops on such unseen architectures. In this context, we propose a novel approach based on three blocks called Base Models, each of which is trained on a deliberately unbalanced dataset so that it extracts the discriminative features of a specific image class (Diffusion Model-generated, GAN-generated, or real). The features extracted by the three blocks are then concatenated and processed to discriminate the origin of the input image. Experimental results show that this approach is not only robust to JPEG compression but also outperforms state-of-the-art methods in several generalization tests. Code, models and dataset are available at https://github.com/opontorno/block-based_deepfake-detection.
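As a rough illustration of the deliberately unbalanced training sets mentioned above, the sketch below builds one binary subset per Base Model from a pool of labelled images. It is not the authors' implementation. It assumes a one-vs-rest formulation (target class against the other two) and an over-represented target class; the function name `make_unbalanced_binary_set`, the `target_fraction` parameter, and the 80/20 ratio are hypothetical and chosen only for illustration.

```python
import random
from collections import Counter

# The three image classes named in the abstract.
CLASSES = ("dm", "gan", "real")


def make_unbalanced_binary_set(samples, target, target_fraction=0.8,
                               size=1000, seed=0):
    """Build a one-vs-rest binary subset with a chosen class imbalance.

    samples: list of (image_id, class_name) pairs.
    Returns a shuffled list of (image_id, label) pairs, where label is 1
    for the target class and 0 for the other two classes.
    The default target_fraction is an assumption, not a value from the paper.
    """
    if target not in CLASSES:
        raise ValueError(f"unknown class: {target}")
    if not 0.0 < target_fraction < 1.0:
        raise ValueError("target_fraction must be strictly between 0 and 1")

    rng = random.Random(seed)
    positives = [img for img, cls in samples if cls == target]
    negatives = [img for img, cls in samples if cls != target]

    n_pos = round(size * target_fraction)
    n_neg = size - n_pos
    if n_pos > len(positives) or n_neg > len(negatives):
        raise ValueError("not enough images for the requested size and ratio")

    # Sample without replacement from each side, then shuffle the result.
    subset = [(img, 1) for img in rng.sample(positives, n_pos)]
    subset += [(img, 0) for img in rng.sample(negatives, n_neg)]
    rng.shuffle(subset)
    return subset


# Toy pool of image identifiers: 600 per class.
pool = [(f"{cls}_{i}", cls) for cls in CLASSES for i in range(600)]

# One unbalanced binary training set per Base Model.
train_sets = {
    cls: make_unbalanced_binary_set(pool, cls, target_fraction=0.8, size=500)
    for cls in CLASSES
}
label_counts = {
    cls: Counter(label for _, label in subset)
    for cls, subset in train_sets.items()
}
```

With the toy pool above, each subset holds 400 target-class images and 100 images drawn from the other two classes. In the described pipeline, each subset would train one Base Model, whose extracted features are later concatenated with those of the other two before the final classifier.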
