EvoGuard: An Extensible Agentic RL-based Framework for Practical and Evolving AI-Generated Image Detection

Chenyang Zhu; Maorong Wang; Jun Liu; Ching-Chun Chang; Isao Echizen

Abstract

The rapid proliferation of AI-Generated Images (AIGIs) has introduced severe risks of misinformation, making AIGI detection a critical yet challenging task. While traditional detection paradigms mainly rely on low-level features, recent research increasingly leverages the general understanding ability of Multimodal Large Language Models (MLLMs) to achieve better generalization. These methods, however, still suffer from limited extensibility and expensive training data annotation. To better address complex and dynamic real-world environments, we propose EvoGuard, a novel agentic framework for AIGI detection. It encapsulates various state-of-the-art (SOTA) off-the-shelf MLLM and non-MLLM detectors as callable tools and coordinates them through a capability-aware dynamic orchestration mechanism. Empowered by the agent's capacities for autonomous planning and reflection, it selects suitable tools for a given sample, reflects on intermediate results, and decides the next action, reaching a final conclusion through multi-turn invocation and reasoning. This design exploits the complementary strengths of heterogeneous detectors, transcending the limits of any single model. Furthermore, the agent is optimized with a GRPO-based Agentic Reinforcement Learning algorithm using only low-cost binary labels, which eliminates the reliance on fine-grained annotations. Extensive experiments demonstrate that EvoGuard achieves SOTA accuracy while mitigating the bias between positive and negative samples. More importantly, it allows plug-and-play integration of new detectors to boost overall performance in a training-free manner, offering a highly practical, long-term solution to ever-evolving AIGI threats. Source code will be publicly available upon acceptance.
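The select-then-orchestrate flow described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: in EvoGuard the decisions are made by an RL-trained agent, whereas here a fixed confidence margin stands in for the agent's reflection step. The names `Tool`, `select_tools`, `orchestrate`, the tag-overlap ranking, and the `margin` and `max_turns` parameters are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Set


@dataclass
class Tool:
    """A detector wrapped as a callable tool (hypothetical structure)."""
    name: str
    profile: Set[str]               # tags this detector is assumed to handle well
    detect: Callable[[str], float]  # assumed to return P(image is AI-generated)


def select_tools(image_tags: Set[str], tools: List[Tool]) -> List[Tool]:
    """Rank tools by tag overlap with the image; fall back to all tools if none match."""
    scored = [(len(image_tags & t.profile), t) for t in tools]
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    return [t for overlap, t in ranked if overlap > 0] or list(tools)


def orchestrate(image: str, image_tags: Set[str], tools: List[Tool],
                margin: float = 0.3, max_turns: int = 3) -> Dict[str, object]:
    """Call tools one at a time and stop once the running mean score is decisive.

    The fixed margin rule is an assumption standing in for the learned agent policy.
    """
    if not tools:
        raise ValueError("at least one tool is required")
    scores: List[float] = []
    called: List[str] = []
    for tool in select_tools(image_tags, tools)[:max_turns]:
        scores.append(tool.detect(image))
        called.append(tool.name)
        mean = sum(scores) / len(scores)
        if abs(mean - 0.5) >= margin:  # confident enough: conclude early
            break
    mean = sum(scores) / len(scores)
    return {"label": "fake" if mean >= 0.5 else "real", "score": mean, "calls": called}


# Toy detectors with constant outputs, purely for illustration.
demo_tools = [
    Tool("detector_a", {"face", "photo"}, lambda img: 0.55),
    Tool("detector_b", {"face"}, lambda img: 0.95),
    Tool("detector_c", {"landscape"}, lambda img: 0.10),
]
result = orchestrate("img.png", {"face"}, demo_tools)
print(result["label"], result["calls"])
```

In this toy run the first matching detector is inconclusive, so a second one is called for validation before a verdict is reached; the detector with no matching tags is never invoked. Adding a new `Tool` to the list requires no retraining in this sketch, which mirrors the plug-and-play extension the abstract describes.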

Paper Structure (16 sections, 4 equations, 3 figures, 3 tables)

Introduction
Related Work
  AI-Generated Image Detection
  MLLM in Visual Forensics
Methodology
  Overview
  Capability-Aware Selection
  Dynamic Orchestration
  Agentic RL Training
  Extend to Future Tools
Experiments
  Experiment Setup
  Performance Comparison
  Extensibility to Future Tools
  Ablation & Mechanism Analysis
...and 1 more sections

Figures (3)

Figure 1: Motivation of EvoGuard. As generative models rapidly evolve, AIGI detection methods must continuously evolve accordingly. Prior work has largely focused on continually building a stronger detector; we instead propose an agentic framework that encapsulates diverse detectors as tools and leverages an agent to schedule them for AIGI detection. This design exploits complementary strengths of heterogeneous detectors, enables training-free extensibility, and reduces reliance on fine-grained training data for MLLM-based methods.
Figure 2: Overview of EvoGuard. We wrap heterogeneous AIGI detectors as tools with capability profiles. Given a query image, the agent first performs Capability-Aware Selection by matching image tags with tool profiles to select suitable tools, then performs Dynamic Orchestration by iteratively analyzing tool outputs to decide whether to call additional tools for validation or to conclude a final detection result.
Figure 3: Results of individual tools (E/F/A); EvoGuard trained on different tool subsets and evaluated on the same subsets (E+F/F+A/E+F+A) and on the full tool set (Extend to 4 Tools); and EvoGuard trained and evaluated on the full tool set (E+F+M+A). The metric is Balanced Accuracy. Every group improves through training-free extension, and its performance is very close to that of training on the full tool set.
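For readers unfamiliar with the metric in Figure 3, the sketch below computes Balanced Accuracy, assuming the standard definition (the mean of recall on each class); this page does not state the paper's exact formula. The function name and the label convention (1 = AI-generated, 0 = real) are assumptions for illustration.

```python
from typing import Sequence


def balanced_accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Mean of recall on AI-generated (1) and recall on real (0) samples."""
    pos = sum(1 for t in y_true if t == 1)
    neg = sum(1 for t in y_true if t == 0)
    if pos == 0 or neg == 0:
        raise ValueError("both classes must be present")
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return 0.5 * (tp / pos + tn / neg)


# A detector that always answers "real" on an imbalanced toy set.
labels = [0] * 8 + [1] * 2
always_real = [0] * 10
plain_accuracy = sum(t == p for t, p in zip(labels, always_real)) / len(labels)
print(plain_accuracy, balanced_accuracy(labels, always_real))
```

On this toy set the always-"real" detector gets 80% plain accuracy but only 50% Balanced Accuracy, which is why the metric is a natural choice when a detector may be biased toward one class.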
