Detecting AI-Generated Images via Diffusion Snap-Back Reconstruction: A Forensic Approach

Mohd Ruhul Ameen, Akif Islam

TL;DR

This work addresses the challenge of distinguishing real images from AI-generated content produced by diffusion models. It introduces diffusion snap-back, a multi-strength img2img reconstruction framework that extracts both local perceptual metrics and global trajectory features from diffusion-based reconstructions, using a diffusion backbone (Stable Diffusion v1.5) and a lightweight classifier. The approach yields strong performance (AUROC up to $0.993$ in cross-validation and $0.990$ on holdout) and demonstrates interpretability through features like knee-step and LPIPS trajectories, with robustness to common distortions. By treating diffusion models as forensic sensors and leveraging manifold-based reconstruction dynamics, the method offers a scalable, model-agnostic signal for synthetic-media forensics with potential for public verification platforms.

Abstract

The rapid rise of generative diffusion models has made distinguishing authentic visual content from synthetic imagery increasingly challenging. Traditional deepfake detection methods, which rely on frequency or pixel-level artifacts, fail against modern text-to-image systems such as Stable Diffusion and DALL-E that produce photorealistic and artifact-free results. This paper introduces a diffusion-based forensic framework that leverages multi-strength image reconstruction dynamics, termed diffusion snap-back, to identify AI-generated images. By analysing how reconstruction metrics (LPIPS, SSIM, and PSNR) evolve across varying noise strengths, we extract interpretable manifold-based features that differentiate real and synthetic images. Evaluated on a balanced dataset of 4,000 images, our approach achieves 0.993 AUROC under cross-validation and remains robust to common distortions such as compression and noise. Despite using limited data and a single diffusion backbone (Stable Diffusion v1.5), the proposed method demonstrates strong generalization and interpretability, offering a foundation for scalable, model-agnostic synthetic media forensics.
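To make the idea of "reconstruction dynamics" concrete, the following is a minimal Python sketch of how a per-strength metric curve could be turned into trajectory features. It is an illustration under stated assumptions, not the authors' implementation: the helper names `psnr` and `trajectory_features`, the choice of summary statistics, and in particular the definition of the knee (the strength at the end of the steepest interval) are all hypothetical. The PSNR formula itself is the standard one. The distance values in the usage lines are toy numbers chosen for illustration, not measurements from the paper; only the strength grid is taken from the figure captions.

```python
import math


def psnr(original, reconstruction, max_value=255.0):
    """Standard PSNR in dB between two equally sized flat pixel sequences."""
    if len(original) != len(reconstruction) or len(original) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((a - b) ** 2 for a, b in zip(original, reconstruction)) / len(original)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


def trajectory_features(strengths, distances):
    """Summarise a distance-versus-strength curve (e.g. LPIPS per strength).

    Assumed, illustrative features (not the paper's exact definitions):
      - slopes: finite-difference slope on each strength interval
      - mean_slope: overall rise divided by overall strength range
      - area: trapezoidal area under the curve
      - knee_strength: strength at the end of the steepest interval
    """
    if len(strengths) != len(distances) or len(strengths) < 2:
        raise ValueError("need at least two matched (strength, distance) points")
    steps = range(len(strengths) - 1)
    slopes = [
        (distances[i + 1] - distances[i]) / (strengths[i + 1] - strengths[i])
        for i in steps
    ]
    area = sum(
        0.5 * (distances[i] + distances[i + 1]) * (strengths[i + 1] - strengths[i])
        for i in steps
    )
    steepest = max(steps, key=lambda i: slopes[i])
    return {
        "slopes": slopes,
        "mean_slope": (distances[-1] - distances[0]) / (strengths[-1] - strengths[0]),
        "area": area,
        "knee_strength": strengths[steepest + 1],
    }


# Strength grid as listed in the figure captions; distances are TOY values.
strengths = [0.15, 0.30, 0.60, 0.90]
toy_abrupt_curve = [0.10, 0.15, 0.25, 0.70]  # hypothetical late, sharp rise
toy_features = trajectory_features(strengths, toy_abrupt_curve)
```

In a full pipeline, one such feature dictionary per metric (LPIPS, SSIM, PSNR) would presumably be concatenated and passed to the lightweight classifier; how the authors actually combine them is not specified in this summary.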

Paper Structure

This paper contains 20 sections, 8 figures, 4 tables.

Figures (8)

  • Figure 1: High-level flow of the synthetic vs real image classification pipeline.
  • Figure 2: Top row: real (human-captured) images. Bottom row: AI-generated (synthetic) images from the dataset.
  • Figure 3: AI-generated example (chickpeas bowl). Progressive diffusion reconstructions at strengths $s \in \{0.15, 0.30, 0.60, 0.90\}$. The synthetic image remains visually consistent and semantically coherent even at $s=0.9$, showing the smooth degradation characteristic of on-manifold behavior.
  • Figure 4: Human-captured example (hikers group photo). Authentic photographs exhibit strong off-manifold divergence at higher noise strengths: fine details and spatial coherence collapse rapidly beyond $s=0.6$, illustrating the knee-step degradation pattern typical of real images.
  • Figure 5: Evaluation metrics on the holdout set. (Left) ROC curve showing AUROC=$0.990$. (Middle) Reliability curve indicating close alignment with perfect calibration. (Right) Confusion matrix at $\theta^* = 0.914$ with minimal false positives/negatives.
  • ...and 3 more figures
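
The confusion matrix in Figure 5 is reported at a fixed decision threshold, $\theta^* = 0.914$. As a rough sketch of what that step involves, the Python snippet below counts the four confusion-matrix cells for a list of classifier scores. The function name `confusion_at_threshold`, the convention that label 1 means "synthetic" and that scores at or above the threshold are predicted synthetic, and the toy scores are all assumptions made for illustration; only the threshold value comes from the caption.

```python
def confusion_at_threshold(scores, labels, threshold):
    """Count TP, FP, TN, FN for binary labels.

    Assumed convention: label 1 = synthetic, label 0 = real, and a score
    greater than or equal to the threshold is predicted as synthetic.
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for score, label in zip(scores, labels):
        predicted = 1 if score >= threshold else 0
        if predicted == 1 and label == 1:
            counts["tp"] += 1
        elif predicted == 1 and label == 0:
            counts["fp"] += 1
        elif predicted == 0 and label == 0:
            counts["tn"] += 1
        else:
            counts["fn"] += 1
    return counts


THETA_STAR = 0.914  # threshold quoted in the Figure 5 caption

# TOY scores and labels for illustration only, not results from the paper.
toy_scores = [0.99, 0.95, 0.50, 0.10, 0.92]
toy_labels = [1, 1, 1, 0, 0]
toy_counts = confusion_at_threshold(toy_scores, toy_labels, THETA_STAR)
```

A relatively high threshold such as 0.914 trades some missed synthetic images for fewer real images flagged as synthetic, assuming the score is the predicted probability of the synthetic class.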