PatchCraft: Exploring Texture Patch for Efficient AI-generated Image Detection

Nan Zhong, Yiran Xu, Sheng Li, Zhenxing Qian, Xinpeng Zhang

TL;DR

The paper tackles the challenge of detecting AI-generated images across unseen generative models by focusing on texture-based fingerprints. It introduces PatchCraft, which applies Smash&Reconstruction to suppress semantic content and leverages the inter-pixel correlation contrast between rich and poor texture regions, amplified by SRM high-pass filters and a lightweight classifier. A comprehensive benchmark spanning 17 generative models demonstrates that PatchCraft generalizes better than state-of-the-art baselines, including under common distortions. The approach offers a robust, size-insensitive fingerprinting method with practical implications for trustworthy AI-generated image detection in real-world settings.

Abstract

Recent generative models show impressive performance in generating photographic images. Humans can hardly distinguish such incredibly realistic-looking AI-generated images from real ones. AI-generated images may lead to ubiquitous disinformation dissemination. Therefore, it is of utmost urgency to develop a detector to identify AI-generated images. Most existing detectors suffer from sharp performance drops over unseen generative models. In this paper, we propose a novel AI-generated image detector capable of identifying fake images created by a wide range of generative models. We observe that the texture patches of images tend to reveal more traces left by generative models compared to the global semantic information of the images. A novel Smash&Reconstruction preprocessing is proposed to erase the global semantic information and enhance texture patches. Furthermore, pixels in rich texture regions exhibit more significant fluctuations than those in poor texture regions. Synthesizing realistic rich texture regions proves to be more challenging for existing generative models. Based on this principle, we leverage the inter-pixel correlation contrast between rich and poor texture regions within an image to further boost the detection performance. In addition, we build a comprehensive AI-generated image detection benchmark, which includes 17 kinds of prevalent generative models, to evaluate the effectiveness of existing baselines and our approach. Our benchmark provides a leaderboard for follow-up studies. Extensive experimental results show that our approach outperforms state-of-the-art baselines by a significant margin. Our project: https://fdmas.github.io/AIGCDetect
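
The preprocessing idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and the abstract does not give the details, so several choices here are assumptions: the texture measure (sum of absolute differences between neighboring pixels), the non-overlapping patch grid, the default `patch_size` and `grid` values, and the function names `texture_score`, `smash_and_reconstruct`, and `high_pass`. The high-pass step is a simple second-order residual used as a stand-in and is not the SRM filter bank mentioned in the TL;DR.

```python
import numpy as np


def texture_score(patch):
    """Assumed richness measure: summed absolute differences between
    horizontally and vertically adjacent pixels."""
    p = patch.astype(np.float64)
    horiz = np.abs(p[:, 1:] - p[:, :-1]).sum()
    vert = np.abs(p[1:, :] - p[:-1, :]).sum()
    return horiz + vert


def smash_and_reconstruct(image, patch_size=32, grid=2):
    """Split a grayscale image into non-overlapping patches, rank them by
    texture score, and tile the grid*grid richest and poorest patches into
    two new images. Tiling patches from unrelated locations discards the
    global layout of the original image."""
    h, w = image.shape
    patches = [
        image[r:r + patch_size, c:c + patch_size]
        for r in range(0, h - patch_size + 1, patch_size)
        for c in range(0, w - patch_size + 1, patch_size)
    ]
    k = grid * grid
    if len(patches) < k:
        raise ValueError("image too small for the requested patch grid")

    # Indices sorted from poorest to richest texture.
    order = sorted(range(len(patches)), key=lambda i: texture_score(patches[i]))
    poor = [patches[i] for i in order[:k]]
    rich = [patches[i] for i in order[-k:]]

    def tile(ps):
        rows = [np.hstack(ps[i * grid:(i + 1) * grid]) for i in range(grid)]
        return np.vstack(rows)

    return tile(rich), tile(poor)


def high_pass(image):
    """Second-order horizontal residual x[j-1] - 2*x[j] + x[j+1].
    A simple stand-in for a high-pass filter, NOT the SRM filter bank."""
    p = image.astype(np.float64)
    return p[:, :-2] - 2.0 * p[:, 1:-1] + p[:, 2:]
```

Under these assumptions, a detector would compare the high-pass residuals of the rich-texture image with those of the poor-texture image, since the abstract argues that the contrast between the two differs for real and generated images. The classifier itself is omitted here.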

Paper Structure

This paper contains 11 sections, 2 equations, 6 figures, 14 tables.

Figures (6)

  • Figure 1: We conduct a comprehensive AI-generated image detection benchmark, including 16 kinds of prevalent generative models [karras2017progressive, karras2019style, brock2018large, zhu2017unpaired, choi2018stargan, park2019semantic, karras2020analyzing, dhariwal2021diffusion, nichol2021glide, rombach2022high, gu2022vector] and commercial APIs [wukong, dall-e-2] like Midjourney [midjourney]. Multiple cutting-edge detectors [wang2020cnn, frank2020leveraging, ju2022fusing, liu2020global, liu2022detecting, tan2023learning, wang2023dire, ojha2023towards] are presented in the benchmark. We visualize the results with a radar chart. The concentric circles denote the detection accuracy. Our approach outperforms the state-of-the-art detector by about 4% in average detection accuracy.
  • Figure 2: The illustration of our motivation.
  • Figure 3: The framework of our approach.
  • Figure 4: The illustration of Smash&Reconstruction.
  • Figure 5: The specific kernel parameters of high-pass filters.
  • ...and 1 more figure