AGENT-X: Adaptive Guideline-based Expert Network for Threshold-free AI-generated teXt detection

Jiatao Li, Mao Ye, Cheng Peng, Xunjian Yin, Xiaojun Wan

TL;DR

AGENT-X introduces a threshold-free, zero-shot AI-generated text detector built from a collaborative multi-agent system that leverages three linguistically grounded guideline dimensions: semantic, stylistic, and structural. A router selects relevant guidelines, base agents produce calibrated decisions with explicit reasoning, and a meta agent aggregates results into an interpretable final verdict. The framework emphasizes confidence steering to avoid external threshold tuning, achieving superior accuracy and generalization across diverse datasets and language models. This approach enhances interpretability and robustness in real-world detection tasks, with strong implications for combating misinformation and ensuring source authenticity.

Abstract

Existing AI-generated text detection methods heavily depend on large annotated datasets and external threshold tuning, restricting interpretability, adaptability, and zero-shot effectiveness. To address these limitations, we propose AGENT-X, a zero-shot multi-agent framework informed by classical rhetoric and systemic functional linguistics. Specifically, we organize detection guidelines into semantic, stylistic, and structural dimensions, each independently evaluated by specialized linguistic agents that provide explicit reasoning and robust calibrated confidence via semantic steering. A meta agent integrates these assessments through confidence-aware aggregation, enabling threshold-free, interpretable classification. Additionally, an adaptive Mixture-of-Agent router dynamically selects guidelines based on inferred textual characteristics. Experiments on diverse datasets demonstrate that AGENT-X substantially surpasses state-of-the-art supervised and zero-shot approaches in accuracy, interpretability, and generalization.
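The confidence-aware aggregation step can be illustrated with a minimal sketch. It assumes each linguistic agent returns a label and a confidence in [0, 1], and that the meta agent sums confidences per label. This section does not give the actual aggregation rule, so the `Assessment` type, the `aggregate` function, and the tie-breaking choice are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Assessment:
    """One agent's verdict (hypothetical structure, not taken from the paper)."""
    guideline: str     # e.g. "semantic", "stylistic", "structural"
    label: str         # "ai" or "human"
    confidence: float  # calibrated confidence in [0, 1]


def aggregate(assessments):
    """Confidence-weighted vote: sum the confidences behind each label.

    Assumption: a plain weighted vote. No external threshold is tuned;
    the label with the larger total wins. Ties resolve to "ai" because
    it is the first key in the totals dictionary.
    """
    totals = {"ai": 0.0, "human": 0.0}
    for a in assessments:
        if a.label not in totals:
            raise ValueError(f"unknown label: {a.label!r}")
        if not 0.0 <= a.confidence <= 1.0:
            raise ValueError(f"confidence out of range: {a.confidence}")
        totals[a.label] += a.confidence
    verdict = max(totals, key=totals.get)
    return verdict, totals


votes = [
    Assessment("semantic", "ai", 0.9),
    Assessment("stylistic", "human", 0.6),
    Assessment("structural", "human", 0.2),
]
verdict, totals = aggregate(votes)
```

In this example two of three agents vote "human", but the single high-confidence "ai" vote outweighs them. That is the difference between confidence-weighted voting and a simple majority vote.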

Paper Structure

This paper contains 44 sections, 5 equations, 3 figures, 6 tables.

Figures (3)

  • Figure 1: Overview of AGENT-X. The router agent dynamically selects relevant linguistic agents, each generating robust, calibrated confidence scores via Confidence Steering. The meta agent aggregates these assessments to classify texts as human-written or AI-generated.
  • Figure 2: Example detection case of AGENT-X. The router agent dynamically activates relevant stylistic guidelines based on inferred textual characteristics. Each guideline-specific agent independently assesses the input text, providing calibrated confidence scores and detailed linguistic reasoning. The meta agent aggregates these evaluations via confidence-weighted voting, making the final interpretable classification decision.
  • Figure 3: Probability histograms and Expected Calibration Error (ECE) for (a)(c) LLM-generated verbalized probabilities and (b)(d) probabilities modified by steered confidence calibration, on GPT-4 generations. The red and gray dashed lines represent accuracy and average confidence, respectively. The blue and yellow bars represent predicted accuracy per confidence bin and miscalibration, respectively.
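
The Expected Calibration Error in Figure 3 can be illustrated with the standard binned definition: ECE is the sum over bins of (bin size / n) × |accuracy(bin) − mean confidence(bin)|. The sketch below assumes equal-width bins and left-closed intervals. The paper's exact binning is not stated in this section, so the function name and the `n_bins` default are illustrative choices.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Binned ECE over equal-width confidence bins on [0, 1].

    confidences: predicted confidence for each example, in [0, 1]
    correct:     booleans, True where the prediction was right
    """
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    n = len(confidences)
    if n == 0:
        raise ValueError("need at least one example")

    # Assign each example to a bin; a confidence of 1.0 goes in the last bin.
    bins = [[] for _ in range(n_bins)]
    for c, ok in zip(confidences, correct):
        idx = min(int(c * n_bins), n_bins - 1)
        bins[idx].append((c, ok))

    ece = 0.0
    for members in bins:
        if not members:
            continue  # empty bins contribute nothing
        avg_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(1 for _, ok in members if ok) / len(members)
        # Per-bin miscalibration gap, weighted by the bin's share of examples.
        ece += (len(members) / n) * abs(accuracy - avg_conf)
    return ece


# Four predictions at 0.9 confidence, three of them correct:
# accuracy 0.75 against confidence 0.9 gives a gap of 0.15.
ece = expected_calibration_error([0.9, 0.9, 0.9, 0.9], [True, True, True, False])
```

The per-bin gap `abs(accuracy - avg_conf)` corresponds to the miscalibration shown by the yellow bars in the figure. A lower ECE means stated confidence tracks observed accuracy more closely.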