FerretNet: Efficient Synthetic Image Detection via Local Pixel Dependencies
Shuqiao Liang, Jian Liu, Renzhang Chen, Quanlong Guan
TL;DR
FerretNet introduces Local Pixel Dependencies (LPD) as a universal artifact representation for detecting synthetic images across GANs, VAEs, and LDMs. LPD uses zero-masked median reconstruction to reveal disruptions in local texture and edges, and FerretNet is a compact detector with 1.1M parameters built on depthwise separable and dilated convolutions. It achieves 97.1% average accuracy across 22 generative models and runs at 772 FPS on an RTX 4090 while outperforming several lightweight baselines; the work also introduces the Synthetic-Pop benchmark. The result is a practical, model-agnostic detector that generalizes well to high-fidelity synthetic images, with potential impact on content authentication and forgery mitigation.
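To illustrate why depthwise separable and dilated convolutions suit a compact detector, the sketch below compares weight counts using the standard formulas for these layer types. The channel counts and kernel size are hypothetical values chosen for illustration, not FerretNet's actual configuration, and bias terms are ignored.

```python
def conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in, c_out, k):
    """Weights in a k x k depthwise conv followed by a 1 x 1 pointwise conv."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 conv that mixes channels
    return depthwise + pointwise


def effective_kernel(k, dilation):
    """Spatial extent covered by a k x k kernel at the given dilation rate."""
    return k + (k - 1) * (dilation - 1)


# Hypothetical layer: 64 -> 64 channels with a 3 x 3 kernel.
standard = conv_params(64, 64, 3)
separable = depthwise_separable_params(64, 64, 3)
print(standard, separable, effective_kernel(3, 2))
```

For this hypothetical layer the separable form needs roughly an eighth of the weights, and a dilation rate of 2 widens a 3 x 3 kernel's coverage to 5 x 5 without adding any weights.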
Abstract
The increasing realism of synthetic images generated by advanced models such as VAEs, GANs, and LDMs poses significant challenges for synthetic image detection. To address this issue, we explore two artifact types introduced during the generation process: (1) latent distribution deviations and (2) decoding-induced smoothing effects, which manifest as inconsistencies in local textures, edges, and color transitions. Leveraging local pixel dependencies (LPD), a property rooted in Markov Random Fields, we reconstruct synthetic images using neighboring pixel information to expose disruptions in texture continuity and edge coherence. Building upon LPD, we propose FerretNet, a lightweight neural network with only 1.1M parameters that delivers efficient and robust synthetic image detection. Extensive experiments demonstrate that FerretNet, trained exclusively on the 4-class ProGAN dataset, achieves an average accuracy of 97.1% on an open-world benchmark comprising 22 generative models. Our code and datasets are publicly available at https://github.com/xigua7105/FerretNet.
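The neighbor-based reconstruction described above can be sketched roughly as follows. This is a minimal, dependency-free illustration under stated assumptions, not the authors' implementation: it assumes a grayscale image stored as a list of rows, a 3 x 3 window, replicated borders, and that the artifact map is the difference between the image and its zero-masked median reconstruction. The function name `lpd_residual` is hypothetical.

```python
from statistics import median


def lpd_residual(image, window=3):
    """Return (reconstruction, residual) for a 2-D grayscale image (list of rows).

    Each pixel is re-estimated as the median of its window after the
    center value has been replaced by zero (assumed masking scheme).
    """
    if window < 3 or window % 2 == 0:
        raise ValueError("window must be an odd integer >= 3")
    height, width = len(image), len(image[0])
    radius = window // 2

    def pixel(y, x):
        # Replicate border pixels by clamping coordinates (an assumption).
        y = min(max(y, 0), height - 1)
        x = min(max(x, 0), width - 1)
        return image[y][x]

    reconstruction = [[0] * width for _ in range(height)]
    residual = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            values = []
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    if dy == 0 and dx == 0:
                        values.append(0)  # zero-mask the center pixel
                    else:
                        values.append(pixel(y + dy, x + dx))
            reconstruction[y][x] = median(values)
            # Large residuals mark pixels that disagree with their neighbors.
            residual[y][x] = image[y][x] - reconstruction[y][x]
    return reconstruction, residual


# A flat patch with one outlier: only the outlier yields a nonzero residual.
patch = [[10] * 5 for _ in range(5)]
patch[2][2] = 200
recon, resid = lpd_residual(patch)
print(resid[2][2], resid[1][1])
```

In this sketch, a pixel that is consistent with its neighborhood leaves a residual near zero, while one that breaks local continuity stands out. A practical version would vectorize the window operation and process each color channel.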
