Decoupling Content and Expression: Two-Dimensional Detection of AI-Generated Text
Guangsheng Bao, Lihua Rong, Yanbin Zhao, Qiji Zhou, Yue Zhang
TL;DR
This paper addresses the detection of AI participation in text across multiple risk levels. It introduces HART, a hierarchical framework of AI risk levels, and a 2D Detection Method that decouples a text's content from its language expression. Treating AI content and AI expression as separate signals, the authors show that content-based features are more robust to surface-level changes and adversarial edits, which yields substantial improvements on level-2 and level-1 detection and strong cross-language performance. The work contributes the HART benchmark, covering diverse domains and languages, together with ablation analyses and practical insights into feature choice, model impact, and data distribution effects. Combining content and expression signals, the 2D method achieves state-of-the-art results on RAID and is resilient to common detection attacks, with implications for policy, safety, and content moderation in multilingual contexts.
Abstract
The wide usage of LLMs raises critical requirements for detecting AI participation in texts. Existing studies investigate such detection in scattered contexts, leaving a systematic and unified approach unexplored. In this paper, we present HART, a hierarchical framework of AI risk levels, each corresponding to a detection task. To address these tasks, we propose a novel 2D Detection Method that decouples a text into content and language expression. Our findings show that content is resistant to surface-level changes, making it a key feature for detection. Experiments demonstrate that the 2D method significantly outperforms existing detectors, improving AUROC from 0.705 to 0.849 for level-2 detection and from 0.807 to 0.886 on RAID. We release our data and code at https://github.com/baoguangsheng/truth-mirror.
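The results above are reported as AUROC, which can be read as the probability that a detector assigns a higher score to a randomly chosen positive (AI-involved) text than to a randomly chosen negative (human) text, with ties counted as one half. The following is a minimal sketch of that definition, not the authors' evaluation code; the function name `auroc` and the toy labels and scores are hypothetical and chosen only for illustration.

```python
def auroc(labels, scores):
    """Pairwise AUROC: fraction of (positive, negative) pairs ranked correctly.

    labels: iterable of 0 (human) / 1 (AI-involved); hypothetical encoding.
    scores: detector scores, where higher means "more likely AI".
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # tie counts as half
    return wins / (len(pos) * len(neg))


# Toy data (assumed, not from the paper): two human and two AI texts.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
result = auroc(toy_labels, toy_scores)
```

On this toy input, three of the four positive-negative pairs are ordered correctly, so `result` is 0.75. A score of 0.5 corresponds to random ranking and 1.0 to perfect separation.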
