SIDBench: A Python Framework for Reliably Assessing Synthetic Image Detection Methods
Manos Schinas, Symeon Papadopoulos
TL;DR
The paper addresses the mismatch between SID research benchmarks and real-world performance by introducing SIDBench, a modular Python framework that standardizes evaluation across diverse SID models and datasets, covering images from GANs and diffusion models as well as high-resolution Synthbuster data. It integrates 11 detection methods with varied input features and backbones, and evaluates them using accuracy (ACC), average precision (AP), true positive rate (TPR), and true negative rate (TNR) under realistic image transformations. Key findings show that fingerprint-based detectors generalize better across generators, but high-resolution diffusion-era images pose new challenges; threshold calibration can improve real-world performance, but a universal threshold remains difficult to find. SIDBench thus provides a practical tool for robust, transformation-aware SID evaluation and guides future improvements in detection and benchmarking practice.
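To make the four metrics named above concrete, here is a minimal pure-Python sketch of how they can be computed from detector scores. It is not SIDBench's own code: the function names `threshold_metrics` and `average_precision`, the label convention (1 = synthetic, 0 = real), and the toy scores are all assumptions made for illustration. ACC, TPR, and TNR depend on the decision threshold, while AP is computed from the ranking of scores alone, which is one reason threshold calibration matters in practice.

```python
from typing import Dict, Sequence


def threshold_metrics(labels: Sequence[int], scores: Sequence[float],
                      threshold: float = 0.5) -> Dict[str, float]:
    """ACC, TPR and TNR at a fixed threshold.

    Assumed convention: label 1 = synthetic, label 0 = real, and a score
    at or above the threshold is predicted as synthetic.
    """
    if len(labels) != len(scores) or not labels:
        raise ValueError("labels and scores must be non-empty and equal length")
    tp = fn = tn = fp = 0
    for y, s in zip(labels, scores):
        predicted_synthetic = s >= threshold
        if y == 1:
            if predicted_synthetic:
                tp += 1
            else:
                fn += 1
        else:
            if predicted_synthetic:
                fp += 1
            else:
                tn += 1
    pos, neg = tp + fn, tn + fp
    return {
        "ACC": (tp + tn) / (pos + neg),
        "TPR": tp / pos if pos else float("nan"),  # synthetic images caught
        "TNR": tn / neg if neg else float("nan"),  # real images kept as real
    }


def average_precision(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Threshold-free AP: mean precision at the rank of each synthetic image.

    Simplified sketch: tied scores are not grouped, so results may differ
    slightly from library implementations when ties occur.
    """
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = 0
    precision_sum = 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / hits if hits else float("nan")


# Toy data (hypothetical): three synthetic and three real images.
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.6, 0.3, 0.4, 0.2, 0.1]

default_metrics = threshold_metrics(toy_labels, toy_scores, threshold=0.5)
lowered_metrics = threshold_metrics(toy_labels, toy_scores, threshold=0.25)
toy_ap = average_precision(toy_labels, toy_scores)
```

On this toy data, lowering the threshold from 0.5 to 0.25 trades TNR for TPR while AP stays the same, since the ranking of scores does not change.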
Abstract
Generative AI technology offers a growing variety of tools for generating entirely synthetic images that are increasingly indistinguishable from real ones. Unlike methods that alter portions of an image, the creation of completely synthetic images presents a unique challenge, and several Synthetic Image Detection (SID) methods have recently appeared to tackle it. Yet there is often a large gap between experimental results on benchmark datasets and the performance of methods in the wild. To better address the evaluation needs of SID and help close this gap, this paper introduces a benchmarking framework that integrates several state-of-the-art SID models. The integrated models were selected for their varied input features and different network architectures, aiming to cover a broad spectrum of techniques. The framework leverages recent datasets featuring a diverse set of generative models and a high level of photorealism and resolution, reflecting the rapid improvements in image synthesis technology. Additionally, the framework enables the study of how image transformations common in assets shared online, such as JPEG compression, affect detection performance. SIDBench is available at https://github.com/mever-team/sidbench and is designed in a modular manner to enable easy inclusion of new datasets and SID models.
