Training-free LLM-generated Text Detection by Mining Token Probability Sequences

Yihuai Xu, Yongwei Wang, Yifei Bi, Huangsen Cao, Zhouhan Lin, Yu Zhao, Fei Wu

TL;DR

This work tackles the challenge of cross-domain LLM-generated text detection by proposing Lastde, a training-free detector that reframes token probability sequences (TPS) as time-series data. It introduces multiscale diversity entropy (MDE) to capture local TPS dynamics and combines it with global log-likelihood to form a robust score, with Lastde++ offering a fast-sampling variant for real-time use. Across six datasets and multiple source/proxy-model configurations, Lastde and Lastde++ achieve state-of-the-art performance among training-free detectors, demonstrating strong robustness to paraphrasing attacks and cross-lingual scenarios while maintaining competitive efficiency. The approach has practical implications for scalable, model-agnostic detection of machine-generated text in diverse real-world settings, including cross-language and cross-model contexts.

Abstract

Large language models (LLMs) have demonstrated remarkable capabilities in generating high-quality texts across diverse domains. However, the potential misuse of LLMs has raised significant concerns, underscoring the urgent need for reliable detection of LLM-generated texts. Conventional training-based detectors often struggle with generalization, particularly in cross-domain and cross-model scenarios. In contrast, training-free methods, which focus on inherent discrepancies through carefully designed statistical features, offer improved generalization and interpretability. Despite this, existing training-free detection methods typically rely on global text sequence statistics and neglect local discriminative features, which limits their detection efficacy. In this work, we introduce a novel training-free detector, termed Lastde, that synergizes local and global statistics for enhanced detection. For the first time, we introduce time series analysis to LLM-generated text detection, capturing the temporal dynamics of token probability sequences. By integrating these local statistics with global ones, our detector reveals significant disparities between human and LLM-generated texts. We also propose an efficient alternative, Lastde++, to enable real-time detection. Extensive experiments on six datasets involving cross-domain, cross-model, and cross-lingual detection scenarios, under both white-box and black-box settings, demonstrate that our method consistently achieves state-of-the-art performance. Furthermore, our approach exhibits greater robustness against paraphrasing attacks than existing baseline methods.

Paper Structure (24 sections, 11 equations, 12 figures, 13 tables)

Figures (12)

  • Figure 1: Comparison of TPS fluctuations and Lastde score distributions between human-written texts and LLM-generated texts, where the first 30 tokens of each human text serve as the prefix that OPT-2.7B continues.
  • Figure 2: Overview of the Lastde detection framework. The example shows how a 7-token text is processed end to end under the setting $s=3,\varepsilon=10,\tau^{\prime}=3$. First, the candidate text is transformed into a token (log) probability sequence (TPS) via inference by a proxy model. Then, global and local statistics of the TPS are mined in parallel. In particular, a Multi-scale Processor maps the TPS into 3 new sequences. Taking $\tau=3$ as an example, $\text{TP}_{i,j,k}$ denotes the mean of three consecutive elements $i, j, k$ of the original TPS. Next, sliding windows segment and rearrange each new sequence, and cosine similarity histograms are computed to derive the Diversity Entropy (DE) of each new sequence. Finally, the Log-Likelihood is divided by the aggregate of all DEs (Agg-MDE) to obtain the detection score, and a decision is made against an appropriate threshold.
  • Figure 3: Sensitivity analysis results for three types of hyperparameters in Lastde++. The legend denotes (Dataset, Source Model). Based on preliminary analysis and the parameter-tuning experiments in the original MDE paper, we explore the following ranges: $s \in \{2, 3, 4, 5, 6\}$, $\varepsilon \in \{n, 4n, 6n, 8n, 10n\}$, $\tau^{\prime} \in \{5, 10, 15, 20, 25\}$. In each experiment, we vary only one type of hyperparameter while keeping the other two fixed at their default settings.
  • Figure 4: Robustness of distribution-based methods to the number of contrast samples. The right y-axis shows the AUROC of fast-sampling methods, while the left y-axis shows that of non-fast-sampling methods (see the Samples Number appendix). White-box detection is performed using Gemma-7B as the source model.
  • Figure 5: Detection results for 6 detection methods across 5 response lengths. Specifically, the 3 hyperparameters of Lastde++ and Lastde were set to $s=3, \varepsilon=1\cdot n, \tau^{\prime}=10$ to adapt to shorter text. The other methods were kept at their default settings.
  • ...and 7 more figures
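
The pipeline described in the Figure 2 caption can be sketched in Python as follows. This is a minimal, non-authoritative illustration, not the authors' implementation. It takes a list of per-token log probabilities that a proxy model has already produced. The function names (`moving_average`, `diversity_entropy`, `lastde_style_score`) are hypothetical. Several choices are assumptions not confirmed by the text above: the overlapping stride-1 mean used as the multi-scale step, cosine similarity taken between adjacent windows, normalization of the entropy by the log of the bin count, the mean log probability as the global Log-Likelihood, and the standard deviation as the aggregation function.

```python
import numpy as np


def moving_average(seq, tau):
    """Multi-scale step (assumed): mean of every tau consecutive elements, stride 1."""
    seq = np.asarray(seq, dtype=float)
    return np.convolve(seq, np.ones(tau) / tau, mode="valid")


def diversity_entropy(seq, s, n_bins):
    """Entropy of the cosine-similarity histogram of adjacent length-s windows."""
    seq = np.asarray(seq, dtype=float)
    if n_bins < 2:
        raise ValueError("n_bins must be at least 2")
    if len(seq) < s + 1:
        raise ValueError("sequence too short for the chosen window size")
    # All sliding windows of length s, shape (len(seq) - s + 1, s).
    windows = np.lib.stride_tricks.sliding_window_view(seq, s)
    a, b = windows[:-1], windows[1:]
    norms = np.linalg.norm(a, axis=1) * np.linalg.norm(b, axis=1)
    cos = np.sum(a * b, axis=1) / np.maximum(norms, 1e-12)
    cos = np.clip(cos, -1.0, 1.0)
    # Histogram over [-1, 1] with n_bins equal-width bins.
    counts, _ = np.histogram(cos, bins=n_bins, range=(-1.0, 1.0))
    p = counts[counts > 0] / counts.sum()
    # Shannon entropy, normalized to [0, 1] by log(n_bins) (assumed normalization).
    return float(-(p * np.log(p)).sum() / np.log(n_bins))


def lastde_style_score(log_probs, s=3, n_bins=10, max_scale=3, agg=np.std):
    """Global log-likelihood divided by the aggregated multiscale DE.

    `agg=np.std` is an assumption; the caption only says the DEs are aggregated.
    """
    log_probs = np.asarray(log_probs, dtype=float)
    des = [
        diversity_entropy(moving_average(log_probs, tau), s, n_bins)
        for tau in range(1, max_scale + 1)
    ]
    agg_mde = float(agg(des))
    # Guard against a zero aggregate (e.g. all scales give the same DE).
    return float(np.mean(log_probs) / max(agg_mde, 1e-12))


if __name__ == "__main__":
    # Synthetic log probabilities standing in for a proxy model's output.
    rng = np.random.default_rng(0)
    fake_log_probs = -rng.exponential(scale=2.0, size=60)
    print(lastde_style_score(fake_log_probs))
```

With 7 tokens, `s=3` and `max_scale=3`, as in the caption's example, the coarsest sequence has 5 elements and yields 3 windows, which is the smallest case in which this sketch still has two adjacent-window similarities to histogram. In practice the resulting score would be compared against a threshold chosen on held-out data.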