Team NYCU at Defactify4: Robust Detection and Source Identification of AI-Generated Images Using CNN and CLIP-Based Models

Tsan-Tsung Yang; I-Wei Chen; Kuan-Ting Chen; Shang-Hsuan Chiang; Wen-Chih Peng

TL;DR

This work tackles the dual problem of detecting AI-generated images and identifying their source models using two complementary pipelines: a CNN-based EfficientNet-B0 that ingests RGB images augmented with frequency-domain and reconstruction-error features, and a CLIP-ViT-based approach paired with an SVM classifier. Evaluated on the Defactify 4 dataset, both methods perform strongly, with CLIP-ViT proving more robust to the image perturbations common in real-world scenarios. The study includes baseline comparisons, robustness tests, and ablation analyses showing that perturbation-aware data augmentation improves generalization. The results suggest practical viability for authenticating digital media and attributing generation sources. The code is publicly available for reproducibility, and future work aims to improve the interpretability of attribution.

Abstract

With the rapid advancement of generative AI, AI-generated images have become increasingly realistic, raising concerns about creativity, misinformation, and content authenticity. Detecting such images and identifying their source models has become a critical challenge in ensuring the integrity of digital media. This paper tackles both tasks, detecting AI-generated images and identifying their source models, using CNN and CLIP-ViT classifiers. For the CNN-based classifier, we use EfficientNet-B0 as the backbone and feed it RGB channels, frequency features, and reconstruction errors. For CLIP-ViT, we adopt a pretrained CLIP image encoder to extract image features and an SVM to perform classification. Evaluated on the Defactify 4 dataset, our methods demonstrate strong performance in both tasks, with CLIP-ViT showing superior robustness to image perturbations. Compared to baselines such as AEROBLADE and OCC-CLIP, our approach achieves competitive results. Notably, our method ranked in the top 3 overall in the Defactify 4 competition, highlighting its effectiveness and generalizability. All of our implementations can be found at https://github.com/uuugaga/Defactify_4
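To make the idea of feeding a CNN both RGB channels and frequency features more concrete, the following is a minimal sketch, not the authors' implementation. It assumes the frequency features are per-channel log-magnitude FFT spectra stacked onto the RGB channels; the abstract does not specify the exact transform, and the function name `add_frequency_channels` is hypothetical. Reconstruction-error features are omitted because they require a pretrained reconstruction model.

```python
import numpy as np


def add_frequency_channels(rgb: np.ndarray) -> np.ndarray:
    """Stack log-magnitude FFT spectra onto an RGB image.

    Hypothetical helper: the choice of log-magnitude spectrum and min-max
    scaling is an assumption, not taken from the paper.

    rgb: H x W x 3 float array with values in [0, 1].
    Returns an H x W x 6 float32 array (3 RGB + 3 frequency channels).
    """
    if rgb.ndim != 3 or rgb.shape[2] != 3:
        raise ValueError("expected an H x W x 3 array")
    rgb = rgb.astype(np.float32)

    # 2D FFT over the spatial axes, computed independently per channel;
    # fftshift moves the zero-frequency (DC) term to the center.
    spectrum = np.fft.fftshift(np.fft.fft2(rgb, axes=(0, 1)), axes=(0, 1))

    # Log compression keeps the large DC term from dominating the range.
    log_mag = np.log1p(np.abs(spectrum))

    # Min-max scale each frequency channel to [0, 1].
    lo = log_mag.min(axis=(0, 1), keepdims=True)
    hi = log_mag.max(axis=(0, 1), keepdims=True)
    log_mag = (log_mag - lo) / np.maximum(hi - lo, 1e-8)

    return np.concatenate([rgb, log_mag.astype(np.float32)], axis=2)
```

With an input of this shape, the first convolution of a backbone such as EfficientNet-B0 would need to accept six input channels instead of three; how the authors handle this is not stated in the abstract.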
