Fast-DetectGPT: Efficient Zero-Shot Detection of Machine-Generated Text via Conditional Probability Curvature
Guangsheng Bao, Yanbin Zhao, Zhiyang Teng, Linyi Yang, Yue Zhang
TL;DR
Fast-DetectGPT reduces the high computational cost of DetectGPT by using conditional probability curvature, computed at the token level, as a zero-shot detection signal. It replaces DetectGPT's perturbation step with independent token sampling and a simpler scoring step, so detection needs only a single forward pass when the sampling and scoring models are the same. Across six datasets and multiple source models, it improves on DetectGPT by about 75% in relative AUROC and runs about 340x faster. It remains robust in both white-box and black-box settings and on real-world text generated by GPT-3, ChatGPT, and GPT-4. The approach also generalizes across domains, decoding strategies, and languages, making it a scalable and practical option for detecting machine-generated text.
Abstract
Large language models (LLMs) have shown the ability to produce fluent and cogent content, presenting both productivity opportunities and societal risks. To build trustworthy AI systems, it is imperative to distinguish between machine-generated and human-authored content. The leading zero-shot detector, DetectGPT, showcases commendable performance but is marred by its intensive computational costs. In this paper, we introduce the concept of conditional probability curvature to elucidate discrepancies in word choices between LLMs and humans within a given context. Utilizing this curvature as a foundational metric, we present **Fast-DetectGPT**, an optimized zero-shot detector, which substitutes DetectGPT's perturbation step with a more efficient sampling step. Our evaluations on various datasets, source models, and test conditions indicate that Fast-DetectGPT not only surpasses DetectGPT by around 75% in relative terms in both the white-box and black-box settings but also accelerates the detection process by a factor of 340, as detailed in Table 1. Code, data, and results are available at https://github.com/baoguangsheng/fast-detect-gpt.
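To make the sampling step concrete, the following is a minimal sketch, not the authors' implementation. It assumes the curvature is the gap between a passage's log-likelihood and the mean log-likelihood of alternatives whose tokens are sampled independently per position, divided by the standard deviation of those alternatives. The function names, the toy logits, and the `n_samples` and `seed` parameters are hypothetical; in practice the logits would come from one forward pass of a language model over the passage.

```python
import math
import random
import statistics


def log_softmax(logits):
    """Numerically stable log-softmax over one position's vocabulary logits."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(v - m) for v in logits))
    return [v - log_z for v in logits]


def curvature(scoring_logits, sampling_logits, token_ids, n_samples=1000, seed=0):
    """Assumed form of the score: (log p(x) - mean) / std over sampled alternatives.

    scoring_logits / sampling_logits: one list of vocabulary logits per position.
    token_ids: the tokens actually observed in the passage, one per position.
    """
    rng = random.Random(seed)
    log_p = [log_softmax(row) for row in scoring_logits]
    q = [[math.exp(v) for v in log_softmax(row)] for row in sampling_logits]
    vocab = range(len(q[0]))

    # Log-likelihood of the observed passage under the scoring model.
    observed = sum(lp[t] for lp, t in zip(log_p, token_ids))

    # Alternatives: each position is sampled independently from the sampling
    # model's distribution, so no extra forward passes are needed.
    sampled = []
    for _ in range(n_samples):
        alt = [rng.choices(vocab, weights=qj)[0] for qj in q]
        sampled.append(sum(lp[t] for lp, t in zip(log_p, alt)))

    # Assumes the sampled log-likelihoods are not all identical (std > 0).
    return (observed - statistics.mean(sampled)) / statistics.pstdev(sampled)


# Toy logits for 3 positions over a 4-token vocabulary (made-up numbers).
logits = [
    [2.0, 0.5, -1.0, 0.0],
    [0.1, 1.5, 0.3, -0.7],
    [-0.5, 0.2, 2.2, 0.4],
]
likely_tokens = [0, 1, 2]    # highest-probability token at every position
unlikely_tokens = [2, 3, 0]  # lowest-probability token at every position

# Same logits for sampling and scoring: the aligned (single forward pass) case.
likely_score = curvature(logits, logits, likely_tokens)
unlikely_score = curvature(logits, logits, unlikely_tokens)
print(likely_score, unlikely_score)
```

In this toy setting, the passage built from the model's most probable tokens gets a positive score and the one built from its least probable tokens gets a negative score. A detector along these lines would compare the score to a threshold chosen on held-out data.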
