Local Statistics for Generative Image Detection
Yung Jer Wong, Teck Khim Ng
TL;DR
The paper addresses the problem of distinguishing real digital camera images from diffusion-model-generated images. It introduces three localized feature sets that exploit Bayer pattern traces and spatial non-stationarity, relying on interpretable, low-cost features rather than deep learning. The approach is robust to image resizing and JPEG compression and generalizes well to unseen diffusion models, outperforming the DIRE detector in cross-dataset tests. The result is a practical forensic tool for assessing media authenticity with limited training data.
Abstract
Diffusion models (DMs) are generative models that learn to synthesize images from Gaussian noise. DMs can be trained for a variety of tasks, such as image generation and image super-resolution. In the past few years, researchers have significantly improved the ability to synthesize photorealistic images. These successes make it more urgent to address the potential misuse of synthesized images. In this paper, we highlight the effectiveness of the Bayer pattern and local statistics in distinguishing digital camera images from DM-generated images. We further hypothesize that local statistics should be used to address the spatial non-stationarity of images. We show that our approach produces promising results for distinguishing real images from synthesized images. The approach is also robust to perturbations such as image resizing and JPEG compression.
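As a rough illustration of what local statistics over a Bayer lattice can look like, the following Python sketch splits a single image channel into its four 2x2 sampling phases and computes the per-block mean and variance of each phase. The function names, the 8-pixel block size, and the choice of mean and variance are assumptions made for illustration only; they are not the three feature sets defined in the paper.

```python
import numpy as np


def local_block_stats(channel, block=8):
    """Per-block mean and variance of a 2-D array (hypothetical helper)."""
    h, w = channel.shape
    # Crop so that both dimensions are a multiple of the block size.
    h, w = h - h % block, w - w % block
    blocks = channel[:h, :w].reshape(h // block, block, w // block, block)
    # Bring the two within-block axes together and flatten them.
    blocks = blocks.transpose(0, 2, 1, 3).reshape(h // block, w // block, -1)
    return blocks.mean(axis=-1), blocks.var(axis=-1)


def bayer_phase_split(channel):
    """Split a channel into the four phases of a 2x2 sampling lattice."""
    return [channel[i::2, j::2] for i in (0, 1) for j in (0, 1)]


def phase_block_features(channel, block=8):
    """Stack local mean/variance maps for each lattice phase.

    Returns an array of shape (4, H', W', 2): phase, block row,
    block column, and (mean, variance). Illustrative only.
    """
    h, w = channel.shape
    # Crop to even dimensions so all four phases have the same shape.
    channel = channel[: h - h % 2, : w - w % 2]
    feats = []
    for phase in bayer_phase_split(channel):
        mean, var = local_block_stats(phase, block)
        feats.append(np.stack([mean, var], axis=-1))
    return np.stack(feats, axis=0)


# Usage on synthetic data (a stand-in for one channel of a real image).
rng = np.random.default_rng(0)
demo = rng.random((64, 64))
features = phase_block_features(demo, block=8)
```

Computing the statistics per block, rather than once over the whole image, keeps regions with different content (for example sky versus foliage) from being averaged together, which is the intuition behind using local statistics on spatially non-stationary images.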
