GPT-who: An Information Density-based Machine-Generated Text Detector

Saranya Venkatraman, Adaku Uchendu, Dongwon Lee

TL;DR

GPT-who introduces a psycholinguistically-inspired, UID-based detector for machine-generated text that uses a fixed set of surprisal-based features computed from an off-the-shelf language model and a logistic regression classifier. The method achieves strong cross-domain performance, outperforming statistical baselines by substantial margins and remaining computationally efficient without LM fine-tuning. UID features reveal distinct author signatures, with humans showing more non-uniform surprisal distributions and LM families exhibiting architecture-specific UID patterns. The approach offers interpretable, domain-agnostic detection with practical running times, and is released with accompanying UID measures and code.

Abstract

The Uniform Information Density (UID) principle posits that humans prefer to spread information evenly during language production. We examine whether this UID principle can help capture differences between text generated by Large Language Models (LLMs) and text written by humans. We propose GPT-who, the first psycholinguistically-inspired domain-agnostic statistical detector. This detector employs UID-based features to model the unique statistical signature of each LLM and human author for accurate detection. We evaluate our method using 4 large-scale benchmark datasets and find that GPT-who outperforms state-of-the-art detectors (both statistical and non-statistical) such as GLTR, GPTZero, DetectGPT, OpenAI detector, and ZeroGPT by over 20% across domains. In addition to better performance, it is computationally inexpensive and utilizes an interpretable representation of text articles. We find that GPT-who can distinguish texts generated by very sophisticated LLMs, even when the overlying text is indiscernible. UID-based measures for all datasets and code are available at https://github.com/saranya-venkatraman/gpt-who.
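
As a rough illustration of what surprisal-based UID features can look like, the sketch below converts token probabilities into per-token surprisal and derives three summary statistics: mean surprisal, variance of surprisal, and the mean squared difference between consecutive surprisals. These are common ways of operationalizing UID and are assumed here for illustration only; the function names (`surprisals`, `uid_features`) and the toy probabilities are hypothetical, and the exact feature set used by GPT-who is the one defined in the paper and its repository.

```python
import math
from statistics import fmean, pvariance


def surprisals(token_probs):
    """Surprisal of each token in bits: -log2 p(token | context)."""
    return [-math.log2(p) for p in token_probs]


def uid_features(token_probs):
    """Illustrative UID-style summary statistics over a surprisal sequence."""
    s = surprisals(token_probs)
    # Squared change in surprisal between adjacent tokens (local non-uniformity).
    local_sq_diffs = [(b - a) ** 2 for a, b in zip(s, s[1:])]
    return {
        "mean_surprisal": fmean(s),
        "surprisal_variance": pvariance(s),
        "mean_sq_local_diff": fmean(local_sq_diffs) if local_sq_diffs else 0.0,
    }


# Toy probabilities (made up), standing in for an off-the-shelf LM's outputs.
uniform_text = [0.25, 0.25, 0.25, 0.25]
uneven_text = [0.5, 0.125, 0.5, 0.125]

print(uid_features(uniform_text))
print(uid_features(uneven_text))
```

In this toy case both sequences have the same mean surprisal (2 bits) but differ in how evenly that information is spread, which is the kind of distinction a classifier over UID features could exploit.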

Paper Structure

This paper contains 20 sections, 1 equation, 5 figures, and 6 tables.

Figures (5)

  • Figure 1: GPT-who leverages psycholinguistically motivated representations that capture authors' information signatures distinctly, even when the corresponding text is indiscernible.
  • Figure 2: An example of UID span feature extraction that selects the most uniform and non-uniform segments from the token surprisal sequence. As can be seen in this example, two texts that read well can have very different underlying information density distributions in a given context. UID features capture these hidden statistical distinctions that are not apparent in their textual form.
  • Figure 3: GPT-who uses token probabilities of articles to extract UID-based features. A classifier then learns to map UID features to different authors, and identify the author of a new unseen article.
  • Figure 4: Distribution of UID Scores of 20 authors from the TuringBench dataset grouped (dotted line) by architecture type. LMs that share architectures tend to distribute UID scores similarly.
  • Figure :
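
The span extraction described in the Figure 2 caption, which selects the most uniform and the most non-uniform segments of a token surprisal sequence, could be sketched as a sliding-window search. The version below is a minimal sketch under stated assumptions: a fixed window length and within-window variance as the uniformity score are both choices made here for illustration, and `uid_spans` is a hypothetical name, not a function from the GPT-who code.

```python
from statistics import pvariance


def uid_spans(surprisal_seq, span_len):
    """Return (most_uniform, least_uniform) windows of length span_len.

    Uniformity is scored by the population variance of surprisal within
    each window; this criterion and the fixed window length are assumptions.
    """
    if span_len < 1 or span_len > len(surprisal_seq):
        raise ValueError("span_len must be between 1 and the sequence length")
    windows = [
        surprisal_seq[i:i + span_len]
        for i in range(len(surprisal_seq) - span_len + 1)
    ]
    most_uniform = min(windows, key=pvariance)
    least_uniform = max(windows, key=pvariance)
    return most_uniform, least_uniform


# Toy surprisal values in bits (made up for illustration).
seq = [2.0, 2.0, 2.0, 1.0, 6.0, 1.0, 3.0]
flat, spiky = uid_spans(seq, span_len=3)
print(flat)   # the window with the lowest variance
print(spiky)  # the window with the highest variance
```

The surprisal values inside the two selected windows could then be appended to the summary statistics to form a fixed-length feature vector for a classifier.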