Variation is the Key: A Variation-Based Framework for LLM-Generated Text Detection

Xuecong Li, Xiaohong Li, Qiang Hu, Yao Zhang, Junjie Wang

TL;DR

VaryBalance is a practical black-box detector for LLM-generated text. It exploits the observation that the mean standard deviation (MSD) between a human-written text and its LLM rewrites is larger than the MSD between a machine-generated text and its rewrites. The method generates multiple rewrites with a rewriter LLM, scores the original and rewritten texts by log perplexity (log PPL) using a small scoring model, and combines these signals into a final exponential score that separates human from LLM-generated content. Experiments across benchmarks, robustness datasets, and multilingual scenarios show AUROC gains of up to 34.3% over state-of-the-art detectors, along with robustness to variations in generating model, genre, and language. The approach is model-agnostic and scalable, and an extended variant improves performance on social-media text.

Abstract

Detecting text generated by large language models (LLMs) is crucial but challenging. Existing detectors either depend on impractical assumptions, such as white-box settings, or rely solely on text-level features, which leads to imprecise detection. In this paper, we propose VaryBalance, a simple but effective and practical method for detecting LLM-generated text. The core observation behind VaryBalance is that human texts differ more from their LLM-rewritten versions than LLM-generated texts do. VaryBalance quantifies this difference through the mean standard deviation and uses it to distinguish human texts from LLM-generated texts. Comprehensive experiments demonstrate that VaryBalance outperforms the state-of-the-art detector, Binoculars, by up to 34.3% in terms of AUROC, and remains robust across multiple generating models and languages.
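
As a rough illustration of the idea in the abstract, the sketch below computes log perplexity from per-token log-probabilities and then measures how far the rewrites' log PPL values sit from the original's. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names `log_ppl` and `rewrite_gap` are hypothetical, the root-mean-square gap is a stand-in for the paper's mean standard deviation (whose exact definition is not given in this summary), and the token log-probabilities are assumed to come from some scoring model that is not shown here.

```python
import math
from typing import Sequence


def log_ppl(token_logprobs: Sequence[float]) -> float:
    """Log perplexity: negative mean of per-token natural-log probabilities."""
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    return -sum(token_logprobs) / len(token_logprobs)


def rewrite_gap(original: Sequence[float],
                rewrites: Sequence[Sequence[float]]) -> float:
    """Hypothetical stand-in for the paper's variation statistic.

    Returns the root-mean-square difference between the original text's
    log PPL and the log PPL of each rewrite. This is an assumption made
    for illustration; it is not the paper's MSD or final score formula.
    """
    if not rewrites:
        raise ValueError("need at least one rewrite")
    base = log_ppl(original)
    squared_gaps = [(log_ppl(r) - base) ** 2 for r in rewrites]
    return math.sqrt(sum(squared_gaps) / len(squared_gaps))


if __name__ == "__main__":
    # Toy log-probabilities, invented purely to exercise the functions.
    human_like = [-4.0, -3.0, -5.0]
    human_rewrites = [[-2.0, -2.0], [-1.0, -3.0]]
    machine_like = [-2.0, -2.0]
    machine_rewrites = [[-2.5, -2.5], [-1.5, -1.5]]

    print("gap for human-like toy input:", rewrite_gap(human_like, human_rewrites))
    print("gap for machine-like toy input:", rewrite_gap(machine_like, machine_rewrites))
```

Under the abstract's observation, a larger gap would point toward human authorship. Any decision threshold, and the exponential form of the final score, would have to come from the paper itself.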

Paper Structure (26 sections, 5 equations, 8 figures, 4 tables)

Figures (8)

  • Figure 1: Comparison of Human/LLM texts and their rewritten texts.
  • Figure 2: Overall workflow of VaryBalance. We first prompt an LLM to rewrite the text, then use a scoring model to calculate $\log \mathrm{PPL}$ and the VaryBalance score of each text piece. A higher VaryBalance score indicates a higher likelihood that the text is human-written.
  • Figure 3: The MSD of human text, machine text, human text rewritten by an LLM, and machine text rewritten by an LLM. "Human text" denotes text written by humans; "machine text" is defined analogously. The blue line represents human text.
  • Figure 4: ROC curves of VaryBalance and four baselines on the benchmark datasets. Upper: GPT as the source model; Lower: Claude as the source model.
  • Figure 5: Comparison of the score distributions of VaryBalance on the Essay Forum dataset.
  • ...and 3 more figures