HEDGE: Heterogeneous Ensemble for Detection of AI-GEnerated Images in the Wild

Fei Wu, Dagong Lu, Mufeng Yao, Xinlei Xu, Fengjun Guo

Abstract

Robust detection of AI-generated images in the wild remains challenging due to the rapid evolution of generative models and varied real-world distortions. We argue that relying on a single training regime, resolution, or backbone is insufficient to handle all conditions, and that structured heterogeneity across these dimensions is essential for robust detection. To this end, we propose HEDGE, a Heterogeneous Ensemble for Detection of AI-GEnerated images, that introduces complementary detection routes along three axes: diverse training data with strong augmentation, multi-scale feature extraction, and backbone heterogeneity. Specifically, Route A progressively constructs DINOv3-based detectors through staged data expansion and augmentation escalation, Route B incorporates a higher-resolution branch for fine-grained forensic cues, and Route C adds a MetaCLIP2-based branch for backbone diversity. All outputs are fused via logit-space weighted averaging, refined by a lightweight dual-gating mechanism that handles branch-level outliers and majority-dominated fusion errors. HEDGE achieves 4th place in the NTIRE 2026 Robust AI-Generated Image Detection in the Wild Challenge and attains state-of-the-art performance with strong robustness on multiple AIGC image detection benchmarks.
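
As a rough illustration of the logit-space weighted averaging mentioned in the abstract, the sketch below converts each branch's fake-probability to a logit, takes a weighted mean, and maps the result back to a probability. The function names, the clipping constant `eps`, and the example weights are assumptions made for illustration; the abstract does not give the actual branch weights, and the dual-gating refinement is not reproduced here because its rules are not specified in this section.

```python
import math


def logit(p, eps=1e-6):
    """Map a probability to logit space; eps clipping is an assumed safeguard."""
    p = min(max(p, eps), 1.0 - eps)
    return math.log(p / (1.0 - p))


def sigmoid(z):
    """Map a logit back to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def fuse_logits(probs, weights):
    """Weighted average of branch outputs in logit space.

    probs   -- per-branch probabilities that the image is AI-generated
    weights -- non-negative per-branch weights (hypothetical values)
    """
    if len(probs) != len(weights):
        raise ValueError("probs and weights must have the same length")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    z = sum(w * logit(p) for p, w in zip(probs, weights)) / total
    return sigmoid(z)


# Hypothetical outputs from three branches with equal weights.
branch_probs = [0.999, 0.5, 0.5]
fused = fuse_logits(branch_probs, [1.0, 1.0, 1.0])
prob_mean = sum(branch_probs) / len(branch_probs)
```

In this example one confident branch moves the fused score further than a plain probability average would (`fused` exceeds `prob_mean`), which is one commonly cited reason for averaging in logit space rather than probability space.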

Paper Structure

This paper contains 34 sections, 3 equations, 3 figures, 5 tables.

Figures (3)

  • Figure 1: Overview of the proposed three-route framework for robust AIGC image detection. Route A progressively constructs DINOv3-based detectors through data expansion and augmentation escalation, Route B introduces a higher-resolution branch to capture fine-grained forensic cues, and Route C adds a MetaCLIP2-based branch for backbone heterogeneity. Their outputs are fused in the logit space and further refined by a dual-gating mechanism.
  • Figure 2: Robustness evaluation under common image perturbations on HiRes-50K (1,000 real + 1,000 fake, unseen during training). We report B.Acc under JPEG compression (left), spatial resizing (middle), and Gaussian blurring (right) at varying intensities. HEDGE maintains near-constant performance across all conditions.
  • Figure 3: t-SNE visualization of M3 (DINOv3-Huge) CLS token features on (a) GenImage and (b) Chameleon.
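
The Figure 2 caption reports B.Acc. Assuming this abbreviates balanced accuracy (the mean of the per-class recalls for real and fake images), a minimal sketch of the metric is given below. The label convention (1 = fake, 0 = real) and the function name are assumptions, not taken from the paper.

```python
def balanced_accuracy(labels, preds):
    """Mean of fake-class recall and real-class recall.

    labels, preds -- sequences of 0 (real) or 1 (fake); assumed convention.
    """
    fake_total = real_total = 0
    fake_correct = real_correct = 0
    for y, y_hat in zip(labels, preds):
        if y == 1:
            fake_total += 1
            fake_correct += int(y_hat == 1)
        else:
            real_total += 1
            real_correct += int(y_hat == 0)
    if fake_total == 0 or real_total == 0:
        raise ValueError("both classes must be present")
    return 0.5 * (fake_correct / fake_total + real_correct / real_total)


# Toy example: 4 fake images (3 detected) and 2 real images (1 kept as real).
score = balanced_accuracy([1, 1, 1, 1, 0, 0], [1, 1, 1, 0, 0, 1])
```

On a class-balanced split such as the 1,000 real + 1,000 fake subset described in the caption, balanced accuracy coincides with plain accuracy.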