Detecting the Undetectable: Combining Kolmogorov-Arnold Networks and MLP for AI-Generated Image Detection
Taharim Rahman Anon, Jakaria Islam Emon
TL;DR
The study tackles the rising challenge of distinguishing real from AI-generated images in the era of advanced generators like DALL-E 3, MidJourney, and Stable Diffusion 3. It couples semantic CLIP embeddings with a baseline MLP and introduces a Hybrid KAN-MLP that leverages a KANLinear module with adaptive spline-based feature transformation, achieving superior out-of-distribution robustness. A new dataset combines real RAISE images with AI images generated by multiple modern models, supplemented by a rigorous OOD test set to probe generalization. Empirically, the Hybrid KAN-MLP delivers higher F1 and AUC on three OOD pairs (Real vs. DALL-E 3, Real vs. MidJourney 5, Real vs. Firefly) than the baseline, demonstrating the value of high-resolution, adaptive feature mappings in forensic detection. The work highlights practical impact for media integrity and digital forensics, while acknowledging data-collection costs and proposing scalable, cost-efficient avenues for future deployment.
Abstract
As artificial intelligence progresses, distinguishing real images from AI-generated ones is increasingly complicated by sophisticated generative models. This paper presents a detection framework for robustly identifying images produced by cutting-edge generative AI models, such as DALL-E 3, MidJourney, and Stable Diffusion 3. We introduce a comprehensive dataset, tailored to include images from these advanced generators, which serves as the foundation for extensive evaluation. We propose a classification system that integrates semantic image embeddings with a traditional Multilayer Perceptron (MLP). This baseline system is designed to differentiate between real and AI-generated images under various challenging conditions. Building on this baseline, we introduce a hybrid architecture that combines Kolmogorov-Arnold Networks (KAN) with the MLP. The hybrid model leverages the adaptive, high-resolution feature transformation capabilities of KAN, enabling our system to capture complex patterns in AI-generated images that conventional models typically overlook. Across three out-of-distribution test datasets, the proposed model consistently outperformed the standard MLP, achieving higher F1 scores and more robust separation of real from AI-generated images.
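As a rough illustration of the hybrid design described above, the following NumPy sketch passes an image embedding through a KAN-style layer and then a small MLP head. It is a forward pass only, with untrained random weights, and is not the authors' implementation. Several details are assumptions made for the example: the class names `KANLinearSketch` and `HybridKanMlpSketch` are hypothetical, the 512-dimensional random vectors merely stand in for CLIP embeddings, the layer sizes are arbitrary, and degree-1 B-splines (hat functions) replace whatever spline order the paper's KANLinear module uses.

```python
import numpy as np


def hat_basis(x, grid):
    """Degree-1 B-spline (hat) basis on a uniform grid.

    x: (batch, in_dim) -> returns (batch, in_dim, n_grid).
    """
    h = grid[1] - grid[0]
    return np.maximum(0.0, 1.0 - np.abs(x[..., None] - grid) / h)


def silu(x):
    return x / (1.0 + np.exp(-x))


class KANLinearSketch:
    """Each input-output edge applies its own learnable 1-D function:
    a SiLU base term plus a weighted sum of spline basis functions."""

    def __init__(self, in_dim, out_dim, n_grid=8, x_range=(-1.0, 1.0), rng=None):
        rng = np.random.default_rng(0) if rng is None else rng
        self.grid = np.linspace(x_range[0], x_range[1], n_grid)
        self.base_w = rng.normal(0.0, 1.0 / np.sqrt(in_dim), (in_dim, out_dim))
        self.spline_w = rng.normal(0.0, 0.1, (in_dim, n_grid, out_dim))

    def __call__(self, x):
        base = silu(x) @ self.base_w
        basis = hat_basis(x, self.grid)  # (batch, in_dim, n_grid)
        spline = np.einsum("big,igo->bo", basis, self.spline_w)
        return base + spline


class HybridKanMlpSketch:
    """KAN-style layer followed by a one-hidden-layer MLP and a sigmoid."""

    def __init__(self, embed_dim=512, kan_dim=64, hidden=32, rng=None):
        rng = np.random.default_rng(0) if rng is None else rng
        self.kan = KANLinearSketch(embed_dim, kan_dim, rng=rng)
        self.w1 = rng.normal(0.0, 1.0 / np.sqrt(kan_dim), (kan_dim, hidden))
        self.b1 = np.zeros(hidden)
        self.w2 = rng.normal(0.0, 1.0 / np.sqrt(hidden), (hidden, 1))
        self.b2 = np.zeros(1)

    def __call__(self, emb):
        # L2-normalise so every coordinate lies in [-1, 1], inside the spline grid.
        emb = emb / np.linalg.norm(emb, axis=1, keepdims=True)
        h = self.kan(emb)
        h = np.maximum(0.0, h @ self.w1 + self.b1)  # ReLU hidden layer
        logit = h @ self.w2 + self.b2
        return 1.0 / (1.0 + np.exp(-logit[:, 0]))  # probability of "AI-generated"


# Stand-in for a batch of four image embeddings (assumed 512-dimensional).
rng = np.random.default_rng(0)
fake_embeddings = rng.normal(size=(4, 512))
model = HybridKanMlpSketch(embed_dim=512, rng=rng)
probs = model(fake_embeddings)
```

Unlike a plain linear layer, which applies one scalar weight per edge, the KAN-style layer here learns a small curve per edge, which is one way to read the "adaptive, high-resolution feature transformation" the abstract refers to. In practice the weights would be trained with a binary cross-entropy loss on real embeddings.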
