OmniDFA: A Unified Framework for Open Set Synthesis Image Detection and Few-Shot Attribution
Shiyu Wu, Shuyan Li, Jing Li, Jing Liu, Yequan Wang
TL;DR
The paper tackles the dual challenge of detecting AI-generated imagery and attributing its synthesis to unseen generators in open-set scenarios. It introduces OmniFake, a large-scale, class-aware synthetic image dataset spanning 45 generators, alongside a real-image collection, to enable robust open-set attribution research. OmniDFA integrates a dual-path feature extractor, supervised contrastive learning, and a learnable real-center with boundary constraints to jointly perform authenticity detection and open-set few-shot attribution. Across comprehensive experiments and cross-dataset validations, OmniDFA achieves state-of-the-art generalization in detection and strong open-set attribution with limited reference samples, highlighting the practical potential of model-architecture-aware attribution for forensic robustness. The authors provide dataset and code resources to foster further research in open-set forgery analysis and real-world deployment.
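To make the supervised contrastive component mentioned above more concrete, here is a minimal NumPy sketch of the generic supervised contrastive (SupCon) loss. It is not the authors' implementation: the function name `supervised_contrastive_loss`, the `temperature` value, and the use of generator labels as class labels are all assumptions made for illustration. The dual-path extractor and the learnable real-center are not modeled here.

```python
import numpy as np

def supervised_contrastive_loss(embeddings, labels, temperature=0.1):
    """Generic SupCon loss (illustrative; not the paper's exact objective).

    embeddings: (n, d) array of non-zero feature vectors.
    labels:     length-n sequence of class ids (assumed here: generator ids).
    """
    z = np.asarray(embeddings, dtype=np.float64)
    z = z / np.linalg.norm(z, axis=1, keepdims=True)  # cosine-similarity space
    labels = np.asarray(labels)
    n = z.shape[0]

    logits = (z @ z.T) / temperature
    not_self = ~np.eye(n, dtype=bool)

    # Subtract the row-wise max for numerical stability (does not change the loss).
    logits = logits - logits.max(axis=1, keepdims=True)
    exp_logits = np.exp(logits) * not_self  # exclude each anchor from its own denominator
    log_prob = logits - np.log(exp_logits.sum(axis=1, keepdims=True))

    # Positives: other samples sharing the anchor's label.
    positives = (labels[:, None] == labels[None, :]) & not_self
    pos_counts = positives.sum(axis=1)
    valid = pos_counts > 0  # anchors with no positive contribute nothing
    if not valid.any():
        return 0.0

    mean_log_prob_pos = (log_prob * positives).sum(axis=1)[valid] / pos_counts[valid]
    return float(-mean_log_prob_pos.mean())
```

The loss is small when samples from the same generator sit close together in embedding space and far from other generators, which is the property a few-shot attributor relies on.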
Abstract
AI-generated image (AIGI) detection and source model attribution remain central challenges in combating deepfake abuse, primarily due to the structural diversity of generative models. Current detection methods are prone to overfitting to specific forgery traits, whereas source attribution offers a robust alternative through fine-grained feature discrimination. However, synthetic image attribution remains constrained by the scarcity of large-scale, well-categorized synthetic datasets, limiting its practicality and compatibility with detection systems. In this work, we propose a new paradigm for image attribution called open-set, few-shot source identification. This paradigm is designed to reliably identify unseen generators using only limited samples, making it highly suitable for real-world applications. To this end, we introduce OmniDFA (Omni Detector and Few-shot Attributor), a novel framework for AIGI analysis that not only assesses the authenticity of images but also determines their synthesis origins in a few-shot manner. To facilitate this work, we construct OmniFake, a large class-aware synthetic image dataset that curates 1.17M images from 45 distinct generative models, substantially enriching the foundational resources for research on both AIGI detection and attribution. Experiments demonstrate that OmniDFA exhibits excellent capability in open-set attribution and achieves state-of-the-art generalization performance on AIGI detection. Our dataset and code will be made available.
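The open-set, few-shot setting described above can be illustrated with a small nearest-prototype sketch: a handful of reference embeddings per generator are averaged into a prototype, and a query is attributed to the closest prototype or rejected as "unknown" when no prototype is similar enough. This is an assumed, simplified stand-in, not OmniDFA itself. The helper names (`build_prototypes`, `attribute`), the cosine-similarity rule, and the `reject_threshold` value are hypothetical, and the embeddings are assumed to come from some pretrained feature extractor.

```python
import numpy as np

def l2_normalize(x, eps=1e-12):
    """Scale vectors to unit length along the last axis."""
    x = np.asarray(x, dtype=np.float64)
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + eps)

def build_prototypes(support_embeddings, support_labels):
    """Average the few reference embeddings of each generator into one prototype."""
    emb = l2_normalize(support_embeddings)
    labels = sorted(set(support_labels))
    protos = np.stack([
        emb[[i for i, lab in enumerate(support_labels) if lab == name]].mean(axis=0)
        for name in labels
    ])
    return labels, l2_normalize(protos)

def attribute(query_embedding, labels, prototypes, reject_threshold=0.5):
    """Return (generator name or "unknown", best cosine similarity).

    reject_threshold is a hypothetical value; in practice it would be
    tuned on held-out data.
    """
    q = l2_normalize(query_embedding)
    sims = prototypes @ q
    best = int(np.argmax(sims))
    if sims[best] < reject_threshold:
        return "unknown", float(sims[best])  # open-set rejection
    return labels[best], float(sims[best])

# Toy usage with 3-D embeddings and two reference samples per generator.
support = [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0], [0.0, 1.0, 0.0], [0.1, 0.9, 0.0]]
names = ["gen_a", "gen_a", "gen_b", "gen_b"]
labels, prototypes = build_prototypes(support, names)
print(attribute([0.95, 0.05, 0.0], labels, prototypes))
print(attribute([0.0, 0.0, 1.0], labels, prototypes))
```

Adding a new generator under this scheme only requires computing one more prototype from its few reference images, with no retraining. That is the practical appeal of the few-shot formulation.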
