UGAD: Universal Generative AI Detector utilizing Frequency Fingerprints
Inzamamul Alam, Muhammad Shahid Muneer, Simon S. Woo
TL;DR
UGAD distinguishes real images from AI-generated fakes by combining frequency-domain analysis in the YCbCr color space with a novel Radial Integral Operation (RIO) and a Spatial Fourier Unit (SFU), which together extract robust spectral-spatial features. A ResNet152 backbone fuses these features for classification, improving accuracy and AUC over existing state-of-the-art methods across diverse GAN and diffusion-model datasets. Extensive ablations show the critical roles of YCbCr preprocessing, the split-shift SFU operations, and the combined RIO+SFU pipeline, and inference time remains practical (~400 ms). The approach is demonstrated on a large, heterogeneous dataset and deployed in a live detection system, underscoring its potential for real-world safeguarding against AI-generated misinformation.
Abstract
In the wake of a fabricated image of an explosion at the Pentagon, the ability to discern real images from fake ones has never been more critical. Our study introduces a novel multi-modal approach to detecting AI-generated images amid the proliferation of new generation methods such as diffusion models. Our method, UGAD, comprises three key detection steps. First, we transform the RGB images into YCbCr channels and apply an Integral Radial Operation to emphasize salient radial features. Second, the Spatial Fourier Extraction operation performs a spatial shift, utilizing a pre-trained deep learning network for optimal feature extraction. Finally, the deep neural network classification stage processes the data through dense layers, using softmax for classification. Our approach significantly enhances the accuracy of differentiating between real and AI-generated images, as evidenced by a 12.64% increase in accuracy and a 28.43% increase in AUC compared to existing state-of-the-art methods.
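
The first step described above (RGB to YCbCr, followed by a radial operation on the spectrum) can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not define the Integral Radial Operation, so this sketch assumes it resembles an azimuthal average of the log-magnitude Fourier spectrum, a common way to build frequency fingerprints. The function names `rgb_to_ycbcr`, `radial_profile`, and `radial_features` are hypothetical, and the conversion uses the full-range BT.601 (JPEG-style) coefficients as a further assumption.

```python
import numpy as np


def rgb_to_ycbcr(rgb):
    """Convert an HxWx3 RGB array with values in [0, 255] to YCbCr.

    Assumption: full-range ITU-R BT.601 coefficients (the JPEG convention).
    """
    rgb = np.asarray(rgb, dtype=np.float64)
    matrix = np.array([
        [0.299, 0.587, 0.114],          # Y
        [-0.168736, -0.331264, 0.5],    # Cb
        [0.5, -0.418688, -0.081312],    # Cr
    ])
    ycbcr = rgb @ matrix.T
    ycbcr[..., 1:] += 128.0  # offset the two chroma channels
    return ycbcr


def radial_profile(channel):
    """Average the log-magnitude spectrum of a 2D array over integer radii.

    Assumption: this azimuthal average stands in for the paper's radial
    operation, whose exact definition is not given in the abstract.
    """
    channel = np.asarray(channel, dtype=np.float64)
    # Centre the zero-frequency component, then compress the dynamic range.
    spectrum = np.fft.fftshift(np.fft.fft2(channel))
    magnitude = np.log1p(np.abs(spectrum))

    height, width = channel.shape
    cy, cx = height // 2, width // 2  # position of the DC term after fftshift
    y, x = np.indices((height, width))
    radius = np.hypot(y - cy, x - cx).astype(np.int64)  # floor to integer bins

    sums = np.bincount(radius.ravel(), weights=magnitude.ravel())
    counts = np.bincount(radius.ravel())
    return sums / np.maximum(counts, 1)


def radial_features(rgb):
    """Concatenate the radial profiles of the Y, Cb, and Cr channels."""
    ycbcr = rgb_to_ycbcr(rgb)
    return np.concatenate([radial_profile(ycbcr[..., c]) for c in range(3)])
```

In a full pipeline, a 1D vector like the one returned by `radial_features` would be one input among several; the spatial Fourier features and the pre-trained backbone described in the abstract are outside the scope of this sketch.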
