Beyond Known Fakes: Generalized Detection of AI-Generated Images via Post-hoc Distribution Alignment

Li Wang, Wenyu Chen, Xiangtao Meng, Zheng Li, Shanqing Guo

TL;DR

The paper addresses the detection of AI-generated images in open-world settings where the test-time generators are unknown. It introduces Post-hoc Distribution Alignment (PDA), a model-agnostic, training-free framework: at test time, images are regenerated with a known generator so that real images align with the known fake distribution while unknown fakes remain misaligned. PDA achieves an average detection accuracy of 96.69% across 16 diverse generators and datasets, outperforming baselines such as DRCT by 10.71%, and is robust to distribution shifts, image transformations, and real-image diversity. The approach relies on a single regeneration step and a KNN-based decision, which keeps its computational cost low and makes it practical for authenticity verification and content moderation as new generators continue to appear.

Abstract

The rapid proliferation of highly realistic AI-generated images poses serious security threats such as misinformation and identity fraud. Detecting generated images in open-world settings is particularly challenging when they originate from unknown generators, as existing methods typically rely on model-specific artifacts and require retraining on new fake data, limiting their generalization and scalability. In this work, we propose Post-hoc Distribution Alignment (PDA), a generalized and model-agnostic framework for detecting AI-generated images under unknown generative threats. Specifically, PDA reformulates detection as a distribution alignment task by regenerating test images through a known generative model. When real images are regenerated, they inherit model-specific artifacts and align with the known fake distribution. In contrast, regenerated unknown fakes contain incompatible or mixed artifacts and remain misaligned. This difference allows an existing detector, trained on the known generative model, to accurately distinguish real images from unknown fakes without requiring access to unseen data or retraining. Extensive experiments across 16 state-of-the-art generative models, including GANs, diffusion models, and commercial text-to-image APIs (e.g., Midjourney), demonstrate that PDA achieves an average detection accuracy of 96.69%, outperforming the best baseline by 10.71%. Comprehensive ablation studies and robustness analyses further confirm PDA's generalizability and resilience to distribution shifts and image transformations. Overall, our work provides a practical and scalable solution for real-world AI-generated image detection where new generative models emerge continuously.
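
The decision logic described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes that features have already been extracted by some existing detector, that the regenerated features come from passing each test image through the known generator (both steps are replaced here by synthetic vectors), and that the same k-th nearest neighbour distance is used both for filtering known fakes and for the final decision. The function names `knn_distance` and `pda_decide` and the thresholds `tau_raw` and `tau_regen` are hypothetical.

```python
import numpy as np

def knn_distance(queries, bank, k=5):
    """Distance from each query to its k-th nearest neighbour in `bank`.

    Features are L2-normalised first (an assumption of this sketch).
    """
    q = queries / np.linalg.norm(queries, axis=1, keepdims=True)
    b = bank / np.linalg.norm(bank, axis=1, keepdims=True)
    # Pairwise Euclidean distances, shape (n_queries, n_bank).
    d = np.linalg.norm(q[:, None, :] - b[None, :, :], axis=2)
    return np.sort(d, axis=1)[:, k - 1]

def pda_decide(raw_feats, regen_feats, fake_bank, tau_raw, tau_regen, k=5):
    """Label each sample as 'known_fake', 'real', or 'unknown_fake'.

    raw_feats   : features of the test images as received
    regen_feats : features of the same images after regeneration
    fake_bank   : features of fakes from the known generator
    tau_raw, tau_regen : hypothetical distance thresholds
    """
    labels = np.empty(len(raw_feats), dtype=object)
    # Step 1: samples already close to the known fakes are known fakes.
    known = knn_distance(raw_feats, fake_bank, k) <= tau_raw
    labels[known] = "known_fake"
    # Steps 2-3: among the rest, regenerated reals align with the known
    # fake distribution; regenerated unknown fakes stay far from it.
    rest = ~known
    aligned = knn_distance(regen_feats[rest], fake_bank, k) <= tau_regen
    labels[rest] = np.where(aligned, "real", "unknown_fake")
    return labels

# Synthetic stand-ins for detector features (illustration only).
rng = np.random.default_rng(0)
dim = 8
center, other, third = np.eye(dim)[0], np.eye(dim)[1], np.eye(dim)[2]
fake_bank = center + 0.01 * rng.standard_normal((50, dim))
# Sample 0: known fake; sample 1: real; sample 2: unknown fake.
raw_feats = np.stack([center, other, third]) + 0.01 * rng.standard_normal((3, dim))
regen_feats = np.stack([center, center, third]) + 0.01 * rng.standard_normal((3, dim))

labels = pda_decide(raw_feats, regen_feats, fake_bank, tau_raw=0.5, tau_regen=0.5)
print(list(labels))
```

In this toy setup the real sample is far from the known fakes before regeneration and close to them afterwards, which is the alignment effect the abstract describes. How the thresholds are calibrated in practice is not stated in this summary, so the fixed values above are placeholders.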

Paper Structure

This paper contains 36 sections, 11 equations, 11 figures, 9 tables, 1 algorithm.

Figures (11)

  • Figure 1: High-level illustration of PDA: Reals become distributionally aligned with known fakes through regeneration, while unknown fakes remain misaligned in the feature space.
  • Figure 2: The overall framework of our PDA. It consists of three key steps: 1) filtering out known fakes by measuring alignment with the known fake distribution in the raw feature space; 2) regenerating the remaining images—a mixture of real and unknown fake samples—using a known generator; and 3) distinguishing real images from unknown fakes in the regenerated feature space based on deep KNN distances and a threshold-based criterion.
  • Figure 3: t-SNE visualization. Rows correspond to raw and regenerated feature spaces (“Fake” denotes known fake distribution).
  • Figure 4: KNN distance distributions. Rows correspond to raw and regenerated feature spaces (“Fake” denotes known fake distribution).
  • Figure 5: The impact of activation pruning.
  • ...and 6 more figures