Semi-Truths: A Large-Scale Dataset of AI-Augmented Images for Evaluating Robustness of AI-Generated Image detectors

Anisha Pal, Julia Kruk, Mansi Phute, Manognya Bhattaram, Diyi Yang, Duen Horng Chau, Judy Hoffman

TL;DR

SEMI-TRUTHS is a dataset of 27,600 real images, 223,400 masks, and 1,472,700 AI-augmented images with targeted, localized perturbations produced using diverse augmentation techniques, diffusion models, and data distributions. Evaluations on it suggest that state-of-the-art detectors vary in their sensitivity to the types and degrees of perturbations, data distributions, and augmentation methods used.

Abstract

Text-to-image diffusion models have impactful applications in art, design, and entertainment, yet these technologies also pose significant risks by enabling the creation and dissemination of misinformation. Although recent advancements have produced AI-generated image detectors that claim robustness against various augmentations, their true effectiveness remains uncertain. Do these detectors reliably identify images with different levels of augmentation? Are they biased toward specific scenes or data distributions? To investigate, we introduce SEMI-TRUTHS, featuring 27,600 real images, 223,400 masks, and 1,472,700 AI-augmented images that feature targeted and localized perturbations produced using diverse augmentation techniques, diffusion models, and data distributions. Each augmented image is accompanied by metadata for standardized and targeted evaluation of detector robustness. Our findings suggest that state-of-the-art detectors exhibit varying sensitivities to the types and degrees of perturbations, data distributions, and augmentation methods used, offering new insights into their performance and limitations. The code for the augmentation and evaluation pipeline is available at https://github.com/J-Kruk/SemiTruths.

Paper Structure

This paper contains 44 sections, 39 figures, 4 tables, 2 algorithms.

Figures (39)

  • Figure 1: Semi-Truths image augmentations that are measured by the size of the augmented region (Area Ratio) and the semantic change achieved (Semantic Magnitude), categorized into $3$ levels - small (col1), medium (col2), and large (col3).
  • Figure 2: End-to-end pipeline for Semi-Truths curation and detector stress testing. The Semi-Truths pipeline sources data from $6$ benchmarks and uses $2$ perturbation techniques to perturb images. These images undergo saliency checks, metric computation, and stress testing of detectors across our curated tests based on the computed change metrics.
  • Figure 3: Image Augmentation Pipeline. Components of the image augmentation process for Semi-Truths curation using inpainting and prompt-based-editing methods.
  • Figure 4: Semi-Truths details and metadata. Each augmented image in Semi-Truths is accompanied by metadata detailing properties related to the native data distribution, change magnitude (both area and semantics), and directional semantic edits. Attributes highlighted in yellow are novel contributions presented in this work.
  • Figure 5: Detectors are sensitive to semantic aspects of data distribution. The $3$ detectors, CrossEfficientViT, DE-FAKE and UniversalFakeDetect were evaluated across varying (a) data distribution, (b) scene complexity and (c) scene diversity.
  • ...and 34 more figures