Generalizable Detection of AI Generated Images with Large Models and Fuzzy Decision Tree

Fei Wu, Guanghao Ding, Zijian Niu, Zhenrui Wang, Lei Yang, Zhuosheng Zhang, Shilin Wang

Abstract

The malicious use and widespread dissemination of AI-generated images pose a serious threat to the authenticity of digital content. Existing detection methods exploit low-level artifacts left by common manipulation steps within the generation pipeline, but they often generalize poorly because they overfit to specific generative models. Recently, researchers have turned to Multimodal Large Language Models (MLLMs) for AI-generated content (AIGC) detection, leveraging their high-level semantic reasoning and broad generalization capabilities. While promising, MLLMs lack fine-grained perceptual sensitivity to subtle generation artifacts, which makes them inadequate as standalone detectors. To address this issue, we propose a novel AI-generated image detection framework that integrates lightweight artifact-aware detectors with MLLMs via a fuzzy decision tree. The decision tree treats the outputs of the basic detectors as fuzzy membership values, enabling adaptive fusion of complementary cues from semantic and perceptual perspectives. Extensive experiments demonstrate that the proposed method achieves state-of-the-art accuracy and strong generalization across diverse generative models.

Paper Structure

This paper contains 21 sections, 3 equations, 5 figures, 3 tables.

Figures (5)

  • Figure 1: An example where an MLLM fails to detect a fake image due to its insensitivity to fine-grained visual artifacts.
  • Figure 2: An overview of our proposed method, which integrates lightweight detectors and multimodal large language models via a fuzzy decision tree.
  • Figure 3: An example of the structured prompt used for MLLMs.
  • Figure 4: Accuracy of Qwen3-VL-32B-Instruct with different prompts. Each cell is subdivided into four subcells, representing four output prompts.
  • Figure 5: Visualized structure of the fuzzy decision tree with inference paths for two representative fake image examples. Each internal node is parameterized by a fuzzy predicate $(\mathcal{S}, \phi, \tau)$, where a selected subset of detectors $\mathcal{S}$ is fused via an ensemble operator $\phi$ and compared against a threshold $\tau$.
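
The fuzzy predicate $(\mathcal{S}, \phi, \tau)$ described in the Figure 5 caption can be illustrated with a minimal sketch. This is not the authors' implementation: the detector names, the set of ensemble operators (min, max, mean), the crisp routing rule at each threshold, and the example tree are all assumptions made for illustration. The only element taken from the caption is that each internal node fuses a subset of detector outputs and compares the result against a threshold.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Callable, Dict, Sequence, Union

# Hypothetical ensemble operators (phi) over membership values in [0, 1].
OPERATORS: Dict[str, Callable[[Sequence[float]], float]] = {
    "min": min,    # fuzzy AND: every selected detector must agree
    "max": max,    # fuzzy OR: one confident detector is enough
    "mean": mean,  # plain averaging
}


@dataclass
class Leaf:
    label: str  # final decision, "fake" or "real"


@dataclass
class Node:
    subset: Sequence[str]        # S: names of the detectors fused at this node
    op: str                      # phi: key into OPERATORS
    tau: float                   # threshold the fused score is compared against
    low: Union["Node", Leaf]     # branch taken when fused score < tau
    high: Union["Node", Leaf]    # branch taken when fused score >= tau


def predict(tree: Union[Node, Leaf], scores: Dict[str, float]) -> str:
    """Follow one inference path from the root to a leaf."""
    node = tree
    while isinstance(node, Node):
        fused = OPERATORS[node.op]([scores[name] for name in node.subset])
        node = node.high if fused >= node.tau else node.low
    return node.label


# Illustrative tree; structure and thresholds are invented, not learned.
tree = Node(
    subset=["artifact_cnn", "mllm"], op="max", tau=0.8,
    high=Leaf("fake"),
    low=Node(
        subset=["artifact_cnn", "freq"], op="mean", tau=0.5,
        high=Leaf("fake"),
        low=Leaf("real"),
    ),
)

# Each score is a detector's membership value for the "fake" class.
scores = {"artifact_cnn": 0.62, "freq": 0.55, "mllm": 0.10}
print(predict(tree, scores))
```

In this example the MLLM score is low, so the root predicate fails, but the second node averages the two artifact-aware detectors and still routes the image to a "fake" leaf. This mirrors the motivation in the abstract, where perceptual detectors compensate for an MLLM that misses fine-grained artifacts.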