EIRES: Training-free AI-Generated Image Detection via Edit-Induced Reconstruction Error Shift

Wan Jiang, Jing Yan, Xiaojing Chen, Lin Shen, Chenhao Lin, Yunfeng Diao, Richang Hong

TL;DR

EIRES introduces a training-free detector that exploits edit-induced reconstruction error shifts to distinguish real from AI-generated images. By applying structured edits and measuring the maximal change in a perceptual reconstruction error, EIRES achieves strong zero-shot performance across diverse generators and remains robust under common post-processing. The approach is underpinned by a geometric lower bound tied to the decoder Jacobian, explaining why real images react differently from generated ones to edits. Extensive experiments on GenImage illustrate superior generalization and stability compared with both training-based and training-free baselines, highlighting practical applicability in open-world content authentication.

Abstract

Diffusion models have recently achieved remarkable photorealism, making it increasingly difficult to distinguish real images from generated ones, raising significant privacy and security concerns. In response, we present a key finding: structural edits enhance the reconstruction of real images while degrading that of generated images, creating a distinctive edit-induced reconstruction error shift. This asymmetric shift enhances the separability between real and generated images. Building on this insight, we propose EIRES, a training-free method that leverages structural edits to reveal inherent differences between real and generated images. To explain the discriminative power of this shift, we derive the reconstruction error lower bound under edit perturbations. Since EIRES requires no training, thresholding depends solely on the natural separability of the signal, where a larger margin yields more reliable detection. Extensive experiments show that EIRES is effective across diverse generative models and remains robust on the unbiased subset, even under post-processing operations.

Paper Structure

This paper contains 29 sections, 6 theorems, 30 equations, 9 figures, 9 tables, 1 algorithm.

Key Result

Proposition 1

Let $E$ and $D$ be the encoder and decoder of a reconstruction model. For any image $x$ with projection $\tilde{x} \in \mathcal{M}$ onto the reconstruction manifold and normal deviation $\varepsilon_\perp = x - \tilde{x}$, the reconstruction error satisfies a lower bound that depends on $\kappa_D$, the local condition number of the decoder Jacobian $J_D$ at $\tilde{x}$.
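
As a hedged illustration of the geometry behind Proposition 1, the toy sketch below swaps the reconstruction model for a linear autoencoder whose manifold $\mathcal{M}$ is a 2-D subspace of $\mathbb{R}^5$. This is an assumption made purely for illustration, not the paper's model, and the names `basis`, `encode`, `decode`, and `normal_residual` are hypothetical. In this linear case the decoder Jacobian is `basis` itself (condition number 1), the residual $\varepsilon_\perp$ is orthogonal to the tangent space, and no point on the manifold is closer to $x$ than $\tilde{x}$. The paper's actual bound involving $\kappa_D$ is not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(0)

# Orthonormal basis of a 2-D linear "manifold" in R^5 (toy assumption).
basis, _ = np.linalg.qr(rng.standard_normal((5, 2)))


def encode(x: np.ndarray) -> np.ndarray:
    """Toy linear encoder E: coordinates of x in the subspace."""
    return basis.T @ x


def decode(z: np.ndarray) -> np.ndarray:
    """Toy linear decoder D: its Jacobian is `basis` everywhere."""
    return basis @ z


def normal_residual(x: np.ndarray) -> np.ndarray:
    """Return eps_perp = x - x_tilde, where x_tilde = D(E(x))."""
    return x - decode(encode(x))


x = rng.standard_normal(5)          # an off-manifold input
eps_perp = normal_residual(x)
recon_error = np.linalg.norm(eps_perp)

# The residual has no component along the tangent directions.
tangent_component = basis.T @ eps_perp

# Distance from x to other points on the manifold, for comparison.
other_errors = [
    np.linalg.norm(x - decode(rng.standard_normal(2))) for _ in range(100)
]
```

Here `recon_error` is the smallest error any decoder output can achieve for this `x`, which is the sense in which the residual is unavoidable for off-manifold inputs.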

Figures (9)

  • Figure 1: Overview of EIRES. We aim to establish a boundary between real and generated images without training, so that a simple threshold can separate them. The original image is first passed through an autoencoder to compute its reconstruction error. The image is then modified by a Multi-Edit module (ME), which applies a series of structured edits: Add, Erase, and SemR. The maximum change in reconstruction error between the original and the edited images defines the Edit-Induced Reconstruction Error Shift (EIRES) score. The image is classified as real or generated by comparing its EIRES score to a threshold determined on a validation set of real data.
  • Figure 2: Geometric interpretation of off-manifold reconstruction. A real image $x_r$ is projected onto the reconstruction manifold at $\tilde{x}_r$, producing a normal residual $\varepsilon_\perp$ that is orthogonal to the local tangent space $T_{\tilde{x}_r}\mathcal{M}$. This residual is unavoidable for off-manifold inputs and induces a lower-bounded reconstruction error determined by the decoder Jacobian.
  • Figure 3: Detection robustness under real-world degradations. (a) AP under different crop ratios $f$. (b) AP under varying JPEG compression quality $q$.
  • Figure 4: Example of our method for detecting an image. On the left side of the figure, we show the LPIPS distance and visual results between the original image and the edited versions induced by structured edits, highlighting the controllable nature of the edits. It is important to note that in the actual detection process, calculating these distances is not necessary. The final detection score, EIRES, is derived from the maximum deviation after applying multiple edits.
  • Figure 5: Visualization of reconstruction behavior under the Add editing operation across different generative models. The Add operation is implemented using Add-it (Tewel et al., 2025) based on FLUX.1, with a circular mask applied to the image and the prompt “Insert a small and brightly colored ball.” For each model, we show the input image, its autoencoder reconstruction, and the corresponding LPIPS heatmap of reconstruction error. Real images (leftmost) exhibit noticeably reduced reconstruction error after editing, whereas generated images (BigGAN, GLIDE, SD v1.4, SD v1.5, Wukong) display degraded or unstable reconstructions. This contrast illustrates the asymmetric edit-induced reconstruction shift that EIRES leverages for distinguishing real from generated images.
  • ...and 4 more figures
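
The pipeline described in the Figure 1 caption can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: `reconstruct`, `distance`, and the entries of `edits` are hypothetical callables standing in for the autoencoder, the perceptual metric, and the Add/Erase/SemR edits. The sign convention (edited error minus original error) and the quantile-based threshold are also assumptions; the source only states that the threshold comes from a validation set of real data.

```python
from typing import Callable, Sequence

import numpy as np

Image = np.ndarray
Edit = Callable[[Image], Image]


def mse(a: Image, b: Image) -> float:
    """Toy stand-in for a perceptual distance such as LPIPS."""
    return float(np.mean((a - b) ** 2))


def add_patch(x: Image) -> Image:
    """Toy 'Add' edit: paint a bright 2x2 patch (illustrative only)."""
    out = x.copy()
    out[:2, :2] = 1.0
    return out


def erase_patch(x: Image) -> Image:
    """Toy 'Erase' edit: blank a 2x2 patch (illustrative only)."""
    out = x.copy()
    out[:2, :2] = 0.0
    return out


def recon_error(x: Image, reconstruct: Edit, distance=mse) -> float:
    """Distance between an image and its reconstruction."""
    return float(distance(x, reconstruct(x)))


def eires_score(x: Image, edits: Sequence[Edit], reconstruct: Edit,
                distance=mse) -> float:
    """Largest shift in reconstruction error over all edits.

    Assumed convention: shift = error(edited) - error(original).
    """
    base = recon_error(x, reconstruct, distance)
    shifts = [recon_error(e(x), reconstruct, distance) - base for e in edits]
    return max(shifts)


def threshold_from_real(real_scores: Sequence[float],
                        quantile: float = 0.95) -> float:
    """Pick a threshold from scores of known-real validation images."""
    return float(np.quantile(np.asarray(real_scores, dtype=float), quantile))


def is_generated(score: float, threshold: float) -> bool:
    """Flag an image whose error shift exceeds the real-data threshold."""
    return score > threshold
```

With a real autoencoder and edit model plugged in, only the three callables change; the scoring and thresholding logic stays the same, and no step requires training.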

Theorems & Definitions (10)

  • Proposition 1
  • Proposition 2
  • Lemma 9.1
  • proof
  • Lemma 9.2
  • proof
  • Proposition 3: Detailed form of proposition 1 in the main paper
  • proof
  • Proposition 4: [Proposition 2 in the main paper]
  • proof