HRR: Hierarchical Retrospection Refinement for Generated Image Detection
Peipei Yuan, Zijing Xie, Shuo Ye, Hong Chen, Yulong Wang
TL;DR
This paper tackles the challenge of reliably detecting generated images across diverse generative models and image scales. It introduces HRR, a diffusion-model-based framework with two core components: Multi-scale Style Retrospection (MSR) to produce scale-aware, style-robust features, and Additive Feature Refinement (AFR) to sparsely refine features via a correntropy-based loss. By combining KL divergence regularization and additive sparse optimization, HRR improves cross-generator generalization and scale-invariance, achieving state-of-the-art results on DIRE, ForenSynths, and cocoFake, with ablations confirming the complementary benefits of MSR and AFR. The approach offers a practical pathway to robust detection in real-world settings where manipulated imagery spans multiple generators and varying resolutions, and it provides insights into multi-scale, style-agnostic representations for forensic tasks.
Abstract
Generative artificial intelligence holds significant potential for abuse, and generated image detection has become a key focus of research. However, existing methods primarily focus on detecting images from a specific generative model and emphasize the localization of synthetic regions, while neglecting the interference that image size and style cause in model learning. Our goal is to reach a fundamental conclusion: is the image real or generated? To this end, we propose a diffusion model-based generated image detection framework termed Hierarchical Retrospection Refinement (HRR). It includes a multi-scale style retrospection module that encourages the model to generate detailed and realistic multi-scale representations, while alleviating the learning biases introduced by dataset styles and generative models. Additionally, based on the principle of the correntropy sparse additive machine, a feature refinement module is designed to reduce the impact of redundant features on learning and capture the intrinsic structure and patterns of the data, thereby improving the model's generalization ability. Extensive experiments demonstrate that the HRR framework consistently delivers significant performance improvements, outperforming state-of-the-art methods in the generated image detection task.
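To make the correntropy idea behind the feature refinement module more concrete, the following is a minimal sketch of a generic correntropy-induced (Welsch-type) loss combined with an L1 sparsity penalty. It is not the loss used in HRR, whose exact formulation is not given in this summary. The function names, the kernel width `sigma`, and the penalty weight `lam` are hypothetical and chosen only for illustration.

```python
import math


def correntropy_loss(residuals, sigma=1.0):
    """Mean correntropy-induced loss with a Gaussian kernel (illustrative form).

    Each residual r contributes 1 - exp(-r^2 / (2 * sigma^2)), which is
    bounded above by 1, so a single large outlier cannot dominate the mean.
    """
    scale = 2.0 * sigma ** 2
    return sum(1.0 - math.exp(-(r ** 2) / scale) for r in residuals) / len(residuals)


def l1_penalty(weights):
    """Sum of absolute weights; encourages sparse use of feature components."""
    return sum(abs(w) for w in weights)


def refinement_objective(residuals, weights, sigma=1.0, lam=0.01):
    """Hypothetical objective: robust data term plus sparsity term."""
    return correntropy_loss(residuals, sigma) + lam * l1_penalty(weights)
```

Because each term is capped at 1, the data term behaves like a squared error for small residuals and saturates for large ones, which is the usual motivation for correntropy-based objectives in the presence of noisy or redundant features.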
