All Patches Matter, More Patches Better: Enhance AI-Generated Image Detection via Panoptic Patch Learning
Zheng Yang, Ruoxin Chen, Zhiyuan Yan, Ke-Yue Zhang, Xinghe Fu, Shuang Wu, Xiujun Shu, Taiping Yao, Shouhong Ding, Xi Li
TL;DR
This work targets the generalization challenge in detecting AI-generated images (AIGIs) by proposing Panoptic Patch Learning (PPL), which pushes the detector to use information from all image patches. The framework combines Random Patch Replacement (RPR), which disrupts reliance on dominant patches, with Patch-wise Contrastive Learning (PCL), which aligns patch representations across the image and promotes uniform patch utilization. The authors formalize two principles, All Patches Matter and More Patches Better, diagnose Few-Patch Bias via Total Direct Effect (TDE) analysis of per-patch contributions to detection decisions, and report state-of-the-art performance on GenImage, DRCT, and Chameleon, with strong robustness to corruptions and masking. These gains in cross-generator generalization make AIGI detection more reliable as generative models evolve. Key ideas include leveraging distributed patch artifacts, mitigating lazy learning, and optimizing a combined loss that preserves discriminative power across all patches: $\mathcal{L}_{total} = \lambda \mathcal{L}_{con} + (1-\lambda) \mathcal{L}_{ce}$, where $\mathcal{L}_{con}$ is a margin-based patch-wise contrastive loss, $\mathcal{L}_{ce}$ is the cross-entropy classification loss, and $\lambda$ weights the two terms.
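The combined objective can be illustrated with a minimal NumPy sketch. The summary does not give the exact form of $\mathcal{L}_{con}$, so the code assumes a standard pairwise margin contrastive loss: patch embeddings with the same real/synthetic label are pulled together, and those with different labels are pushed at least `margin` apart. The function names, the default `margin`, and the default `lam` are hypothetical and chosen only for illustration.

```python
import numpy as np

def contrastive_margin_loss(feats, labels, margin=1.0):
    """Pairwise margin contrastive loss over patch embeddings.

    ASSUMPTION: this is a generic margin-based form, not necessarily
    the exact loss used in the paper.
    feats:  (N, D) patch embeddings
    labels: (N,)   0 = real patch, 1 = synthetic patch
    """
    feats = np.asarray(feats, dtype=float)
    labels = np.asarray(labels)
    # Euclidean distance between every pair of patch embeddings.
    diff = feats[:, None, :] - feats[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    same = labels[:, None] == labels[None, :]
    # Keep each unordered pair once (upper triangle, no diagonal).
    iu = np.triu_indices(len(labels), k=1)
    d, s = dist[iu], same[iu]
    if d.size == 0:
        return 0.0
    # Same label: penalize distance. Different label: penalize closeness.
    per_pair = np.where(s, d ** 2, np.maximum(0.0, margin - d) ** 2)
    return float(per_pair.mean())

def binary_cross_entropy(probs, targets, eps=1e-12):
    """Mean binary cross-entropy for predicted synthetic-probabilities."""
    probs = np.clip(np.asarray(probs, dtype=float), eps, 1.0 - eps)
    targets = np.asarray(targets, dtype=float)
    losses = targets * np.log(probs) + (1.0 - targets) * np.log(1.0 - probs)
    return float(-losses.mean())

def total_loss(l_con, l_ce, lam=0.5):
    """L_total = lam * L_con + (1 - lam) * L_ce, as stated in the summary."""
    return lam * l_con + (1.0 - lam) * l_ce
```

With `lam` close to 1 the objective is dominated by the patch-wise contrastive term, and with `lam` close to 0 it reduces to plain classification; the value used by the authors is not stated here.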
Abstract
The exponential growth of AI-generated images (AIGIs) underscores the urgent need for robust and generalizable detection methods. In this paper, we establish two key principles for AIGI detection through systematic analysis: (1) All Patches Matter: Unlike conventional image classification, where discriminative features concentrate on object-centric regions, each patch in an AIGI inherently contains synthetic artifacts due to the uniform generation process, so every patch serves as an important artifact source for detection. (2) More Patches Better: Leveraging distributed artifacts across more patches captures complementary forensic evidence and reduces over-reliance on specific patches, thereby enhancing robustness and generalization. However, our counterfactual analysis reveals an undesirable phenomenon: naively trained detectors often exhibit a Few-Patch Bias, discriminating between real and synthetic images based on a minority of patches. We identify the Lazy Learner phenomenon as the root cause: detectors preferentially learn conspicuous artifacts in a limited set of patches while neglecting the broader artifact distribution. To address this bias, we propose the Panoptic Patch Learning (PPL) framework, comprising: (1) Random Patch Replacement, which randomly substitutes synthetic patches with real counterparts to compel models to identify artifacts in underutilized regions, encouraging the use of more patches; (2) Patch-wise Contrastive Learning, which enforces consistent discriminative capability across all patches, ensuring their uniform utilization. Extensive experiments across two different settings on several benchmarks verify the effectiveness of our approach.
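Random Patch Replacement, as described above, can be sketched in a few lines of NumPy. This is an illustration under stated assumptions rather than the authors' implementation: it assumes each synthetic image is paired with a real image of the same shape, that the "real counterpart" of a patch is the co-located patch in that real image, and that patches lie on a regular non-overlapping grid. The function name and the `patch`, `ratio`, and `rng` parameters are hypothetical.

```python
import numpy as np

def random_patch_replacement(fake, real, patch=16, ratio=0.5, rng=None):
    """Swap a random subset of patches in `fake` for the co-located
    patches of `real`.

    ASSUMPTIONS: paired images of identical shape (H, W) or (H, W, C),
    and H, W divisible by `patch`.
    Returns the augmented image and a (H//patch, W//patch) boolean grid
    where True marks patches that are still synthetic.
    """
    rng = np.random.default_rng() if rng is None else rng
    if fake.shape != real.shape:
        raise ValueError("fake and real must have the same shape")
    h, w = fake.shape[:2]
    if h % patch or w % patch:
        raise ValueError("image size must be divisible by patch size")

    gh, gw = h // patch, w // patch
    n = gh * gw
    k = int(round(ratio * n))  # number of patches to replace

    # Choose k distinct grid cells to overwrite with real content.
    replaced = np.zeros(n, dtype=bool)
    replaced[rng.choice(n, size=k, replace=False)] = True
    replaced = replaced.reshape(gh, gw)

    # Expand the grid mask to pixel resolution.
    pixel_mask = np.repeat(np.repeat(replaced, patch, axis=0), patch, axis=1)

    out = fake.copy()
    out[pixel_mask] = real[pixel_mask]
    return out, ~replaced

# Usage: a 32x32 "synthetic" image of ones and a "real" image of zeros.
fake_img = np.ones((32, 32, 3), dtype=np.float32)
real_img = np.zeros((32, 32, 3), dtype=np.float32)
mixed, still_synthetic = random_patch_replacement(
    fake_img, real_img, patch=16, ratio=0.5, rng=np.random.default_rng(0)
)
```

Because the replaced patches change on every call, a detector trained on such mixed images cannot rely on a fixed handful of conspicuous patches. The returned grid could also serve as per-patch labels for a patch-wise loss, although how the authors supervise individual patches is not specified here.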
