UniShield: An Adaptive Multi-Agent Framework for Unified Forgery Image Detection and Localization

Qing Huang, Zhipei Xu, Xuanyu Zhang, Jian Zhang

TL;DR

UniShield is a unified, adaptive framework for cross-domain forgery image detection and localization, built from a perception agent and a detection agent. It automatically routes each image to an appropriate detector across four domains, image manipulation (IMDL), document manipulation (DMDL), DeepFake (DFD), and AI-generated images (AIGCD), and outputs a structured, interpretable report. The reported experiments show state-of-the-art performance on both cross-domain benchmarks and domain-specific tasks, indicating strong generalization and practicality. The approach is relevant to forensic investigations, content verification, and safeguarding public trust in the face of AI-generated media.

Abstract

With the rapid advancements in image generation, synthetic images have become increasingly realistic, posing significant societal risks such as misinformation and fraud. Forgery Image Detection and Localization (FIDL) has thus become essential for maintaining information integrity and societal security. Despite the impressive performance of existing domain-specific detection methods, their practical applicability remains limited, primarily due to their narrow specialization, poor cross-domain generalization, and the absence of an integrated adaptive framework. To address these issues, we propose UniShield, a novel multi-agent-based unified system capable of detecting and localizing image forgeries across diverse domains, including image manipulation, document manipulation, DeepFake, and AI-generated images. UniShield integrates a perception agent with a detection agent. The perception agent analyzes image features to dynamically select suitable detection models, while the detection agent consolidates various expert detectors into a unified framework and generates interpretable reports. Extensive experiments show that UniShield achieves state-of-the-art results, surpassing both existing unified approaches and domain-specific detectors, highlighting its practicality, adaptiveness, and scalability.

Paper Structure

This paper contains 18 sections, 4 equations, 4 figures, 7 tables.

Figures (4)

  • Figure 1: Unified Framework for Forgery Image Detection and Localization. (a) Various types of forgery images. (b) Previous methods required users to manually choose detection tools based on image content, often leading to confusion. (c) Our system, UniShield, automatically coordinates various detection tools for efficient, unified forgery detection, and outputs interpretable reports including description, detection, localization, and judgment basis.
  • Figure 2: The pipeline of UniShield. Our method consists of two main components: the Perception Agent and the Detection Agent. The Perception Agent includes a task router that determines the forgery domain and a tool scheduler that selects the appropriate detector type based on image content. The Detection Agent then performs fake detection using the selected expert tool and generates a structured report, including the detection result and the reasoning behind the judgment.
  • Figure 3: Pilot study comparing LLM-based and non-LLM-based detectors on different types of forgeries. The LLM-based model FakeShield performs better on semantic forgeries, while the non-LLM-based method IML-ViT excels at detecting low-level artifact-based manipulations.
  • Figure 4: Illustration of a forgery report produced by UniShield.
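
The two-stage flow described in the Figure 2 caption (task router, then tool scheduler, then expert detector, then report) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: every name here (`PerceptionAgent`, `DetectionAgent`, `Report`, `toy_router`, `toy_scheduler`, `make_stub`) is hypothetical, images are stand-in dictionaries, and the router, scheduler, and detectors are trivial stubs in place of the learned models the paper describes.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple

# Domain and detector-type labels taken from the summary above.
DOMAINS = ("IMDL", "DMDL", "DFD", "AIGCD")
DETECTOR_TYPES = ("llm", "non_llm")

# A detector maps an image to (is_fake, judgment basis). Hypothetical type.
Detector = Callable[[dict], Tuple[bool, str]]


@dataclass
class Report:
    domain: str
    detector_type: str
    is_fake: bool
    basis: str


class PerceptionAgent:
    """Chooses a forgery domain, then a detector type, for one image."""

    def __init__(self, router, scheduler):
        self.router = router          # image -> domain
        self.scheduler = scheduler    # (image, domain) -> detector type

    def plan(self, image):
        domain = self.router(image)
        detector_type = self.scheduler(image, domain)
        if domain not in DOMAINS or detector_type not in DETECTOR_TYPES:
            raise ValueError(f"unknown plan: {domain!r}, {detector_type!r}")
        return domain, detector_type


class DetectionAgent:
    """Holds the expert detectors and turns their output into a report."""

    def __init__(self, experts: Dict[Tuple[str, str], Detector]):
        self.experts = experts

    def run(self, image, domain, detector_type) -> Report:
        is_fake, basis = self.experts[(domain, detector_type)](image)
        return Report(domain, detector_type, is_fake, basis)


def analyze(image, perception, detection) -> Report:
    domain, detector_type = perception.plan(image)
    return detection.run(image, domain, detector_type)


# Toy stand-ins (assumptions): a real system would use learned models here.
def toy_router(image):
    kinds = {"photo": "IMDL", "document": "DMDL",
             "face": "DFD", "generated": "AIGCD"}
    return kinds[image["kind"]]


def toy_scheduler(image, domain):
    # Mirrors the Figure 3 observation: semantic cues go to an LLM-based
    # detector, low-level artifact cues to a non-LLM detector.
    return "llm" if image["cue"] == "semantic" else "non_llm"


def make_stub(name):
    def detect(image):
        return image.get("tampered", False), f"{name} examined cue: {image['cue']}"
    return detect


experts = {(d, t): make_stub(f"{d}-{t}") for d in DOMAINS for t in DETECTOR_TYPES}
perception = PerceptionAgent(toy_router, toy_scheduler)
detection = DetectionAgent(experts)

report = analyze({"kind": "document", "cue": "low_level", "tampered": True},
                 perception, detection)
print(report)
```

Keeping the experts in a registry keyed by (domain, detector type) is one simple way to let new detectors be added without touching the routing logic; whether UniShield organizes its tools this way is not stated in this summary.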