DART: An AIGT Detector using AMR of Rephrased Text

Hyeonchu Park, Byungjun Kim, Bugeun Kim

TL;DR

DART introduces a rephrasing-driven, AMR-based detector for AI-generated text (AIGT) that does not rely on probabilistic features and can discriminate among multiple black-box LLMs, as well as humans, even when the origin of a text is unknown. The framework has four steps: rephrasing with a strong rephraser, AMR-based semantic parsing, gap scoring with $p_i$ and $r_i$, and classification with a simple, interpretable model that takes the resulting semantic feature vector $v$ as input. Across single-candidate, multi-candidate, and leave-one-out experiments on diverse domains (XSum, SQuAD, Reddit, PubMedQA) and four cutting-edge LLMs, DART outperforms baselines with substantial gains in $F_1$ score (e.g., average $F_1$ improvements of ~19 percentage points in some settings) and generalizes notably to unseen origins. The work underscores the practical potential of AMR-based semantics for robust, real-world AIGT detection, while acknowledging limitations related to rephraser choice, AMR parsing accuracy, and the scope of models tested.

Abstract

As large language models (LLMs) generate more human-like texts, concerns about the side effects of AI-generated texts (AIGT) have grown, and researchers have developed methods for detecting AIGT. However, two challenges remain. First, detection performance on black-box LLMs is low because existing detectors focus on probabilistic features. Second, most AIGT detectors have been tested in a single-candidate setting, which assumes that the origin of an AIGT is known and may therefore deviate from real-world scenarios. To resolve these challenges, we propose DART, which consists of four steps: rephrasing, semantic parsing, scoring, and multiclass classification. We conducted three experiments to test the performance of DART. The experimental results show that DART can discriminate among multiple black-box LLMs without relying on probabilistic features or knowing the origin of the AIGT.
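
The scoring step can be illustrated with a minimal sketch. The summary above does not define $p_i$ and $r_i$, so this example assumes they are precision-like and recall-like overlap scores between the AMR triples of the $i$-th text unit and those of its rephrasing. The names `gap_scores` and `feature_vector` are hypothetical, the rephraser and AMR parser are left out (graphs are given directly as sets of triples), and exact triple matching stands in for whatever graph alignment the authors actually use.

```python
from typing import List, Sequence, Set, Tuple

# One AMR edge as (source concept, relation, target concept).
# Assumption: graphs are already parsed and reduced to sets of such triples.
Triple = Tuple[str, str, str]


def gap_scores(original: Set[Triple], rephrased: Set[Triple]) -> Tuple[float, float]:
    """Return (p, r) for one original/rephrased pair.

    p: share of rephrased triples that also occur in the original.
    r: share of original triples that survive in the rephrasing.
    This precision/recall reading of p_i and r_i is an assumption.
    """
    shared = len(original & rephrased)
    p = shared / len(rephrased) if rephrased else 0.0
    r = shared / len(original) if original else 0.0
    return p, r


def feature_vector(
    originals: Sequence[Set[Triple]],
    rephrasings: Sequence[Set[Triple]],
) -> List[float]:
    """Concatenate (p_i, r_i) over all pairs into one flat vector v."""
    v: List[float] = []
    for orig, reph in zip(originals, rephrasings):
        v.extend(gap_scores(orig, reph))
    return v


# Toy graphs for "The boy wants to go" and a rephrasing of it.
original = {
    ("want-01", ":ARG0", "boy"),
    ("want-01", ":ARG1", "go-02"),
}
rephrased = {
    ("want-01", ":ARG0", "boy"),
    ("desire-01", ":ARG1", "go-02"),
    ("go-02", ":ARG0", "boy"),
}

p, r = gap_scores(original, rephrased)
v = feature_vector([original], [rephrased])
```

In this toy pair one triple is shared, so the rephrasing keeps half of the original triples while only a third of its own triples come from the original. Under the stated assumptions, a vector like `v` would then be passed to an off-the-shelf multiclass classifier whose labels are the candidate origins (each LLM plus human).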

Paper Structure

This paper contains 35 sections, 6 figures, 7 tables.

Figures (6)

  • Figure 1: The DART framework
  • Figure 2: Contingency matrix from the multi-candidate experiment. Top (a) and bottom (b) correspond to SeqXGPT and DARTDT, respectively. Actual and predicted classes are shown on the horizontal and vertical axes, respectively.
  • Figure 3: $F_1$ score of detectors as the amount of training data is decreased in the multi-candidate experiment.
  • Figure 4: PCA plot of the first principal component against the second
  • Figure 5: PCA plot of the first principal component against the third
  • ...and 1 more figures