X-Transfer: A Transfer Learning-Based Framework for GAN-Generated Fake Image Detection
Lei Zhang, Hao Chen, Shu Hu, Bin Zhu, Ching Sheng Lin, Xi Wu, Jinrong Hu, Xin Wang
TL;DR
The paper tackles GAN-generated image detection under transfer learning by proposing X-Transfer, a dual-network framework with interleaved gradient updates that preserves source-domain knowledge while adapting to a target domain. It introduces an AUC-oriented loss to address imbalanced data and combines three transfer routes with a dynamic balancing factor, achieving AUC results of up to 99.04% on facial datasets and generalizing well to non-face data. The approach improves transfer efficiency (fewer training epochs) and robustness to post-processing, as supported by experiments and ablation studies across datasets, data augmentations, and model comparisons. Overall, X-Transfer advances reliable detection of GAN-generated images in practical, diverse settings and offers a promising path toward applications beyond facial imagery.
Abstract
Generative adversarial networks (GANs) have advanced remarkably in diverse domains, especially image generation and editing. However, the misuse of GANs to generate deceptive images, such as face replacements, raises significant security concerns that have attracted widespread attention. Effective methods for distinguishing real images from fake ones are therefore urgently needed. Current research centers on the application of transfer learning, which nevertheless faces two challenges: forgetting knowledge learned from the original dataset, and inadequate performance when the training data are imbalanced. To alleviate these issues, this paper introduces a novel GAN-generated image detection algorithm called X-Transfer, which enhances transfer learning by using two neural networks that employ interleaved parallel gradient transmission. In addition, we combine an AUC loss with the cross-entropy loss to improve the model's performance. We carry out comprehensive experiments on multiple facial image datasets. The results show that our model outperforms the general transfer approach, with the best metric reaching 99.04%, an increase of approximately 10%. Furthermore, the model performs well on non-face datasets, which supports its generality and broader application prospects.
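To make the idea of combining an AUC loss with cross-entropy more concrete, the following is a minimal sketch and not the paper's actual formulation. It assumes a pairwise squared-hinge surrogate for AUC, in which every fake (positive) score should exceed every real (negative) score by a margin; the function names, the `margin` value, and the fixed mixing weight `alpha` are all hypothetical choices made for illustration.

```python
import math


def binary_cross_entropy(scores, labels, eps=1e-12):
    """Mean binary cross-entropy over predicted probabilities in (0, 1)."""
    total = 0.0
    for s, y in zip(scores, labels):
        s = min(max(s, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(y * math.log(s) + (1 - y) * math.log(1.0 - s))
    return total / len(scores)


def pairwise_auc_loss(scores, labels, margin=0.3):
    """Assumed AUC surrogate: squared hinge over all (positive, negative) pairs.

    A pair contributes zero loss once the positive score exceeds the
    negative score by at least `margin` (a hypothetical hyperparameter).
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        return 0.0  # no pairs to rank in a single-class batch
    total = 0.0
    for p in pos:
        for n in neg:
            total += max(0.0, margin - (p - n)) ** 2
    return total / (len(pos) * len(neg))


def combined_loss(scores, labels, alpha=0.5, margin=0.3):
    """Weighted sum of cross-entropy and the AUC surrogate.

    `alpha` is a hypothetical fixed weight used only for this sketch.
    """
    ce = binary_cross_entropy(scores, labels)
    auc = pairwise_auc_loss(scores, labels, margin)
    return alpha * ce + (1.0 - alpha) * auc


if __name__ == "__main__":
    labels = [1, 1, 0, 0]  # 1 = fake, 0 = real
    well_ranked = [0.9, 0.8, 0.1, 0.2]
    badly_ranked = [0.1, 0.2, 0.9, 0.8]
    print(combined_loss(well_ranked, labels))
    print(combined_loss(badly_ranked, labels))
```

Because the surrogate averages over positive-negative pairs instead of individual samples, its value does not depend directly on the class ratio, which is one common motivation for AUC-style objectives on imbalanced data.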
