SpecDetect: Simple, Fast, and Training-Free Detection of LLM-Generated Text via Spectral Analysis

Haitong Luo; Weiyao Zhang; Suhang Wang; Wenji Zou; Chungang Lin; Xuying Meng; Yujun Zhang

TL;DR

SpecDetect reframes LLM-generated text detection as a frequency-domain signal-processing problem: it treats the token log-probability sequence as a zero-mean signal and analyzes its spectrum. It identifies DFT total energy, $E_{DFT}$, as a single, hyperparameter-free, robust discriminant; SpecDetect++ adds a sampling discrepancy mechanism to further improve robustness. Empirical results across diverse datasets, models, and decoding strategies show state-of-the-art performance and favorable efficiency, including strong robustness to paraphrasing and varying text lengths, as well as cross-model and non-English generalization. The work demonstrates that classical signal processing can provide a simple, interpretable, and highly effective pathway for detecting LLM-generated text in practical settings.

Abstract

The proliferation of high-quality text from Large Language Models (LLMs) demands reliable and efficient detection methods. While existing training-free approaches show promise, they often rely on surface-level statistics and overlook fundamental signal properties of the text generation process. In this work, we reframe detection as a signal processing problem, introducing a novel paradigm that analyzes the sequence of token log-probabilities in the frequency domain. By systematically analyzing the signal's spectral properties using the global Discrete Fourier Transform (DFT) and the local Short-Time Fourier Transform (STFT), we find that human-written text consistently exhibits significantly higher spectral energy. This higher energy reflects the larger-amplitude fluctuations inherent in human writing compared to the suppressed dynamics of LLM-generated text. Based on this key insight, we construct SpecDetect, a detector built on a single, robust feature from the global DFT: DFT total energy. We also propose an enhanced version, SpecDetect++, which incorporates a sampling discrepancy mechanism to further boost robustness. Extensive experiments show that our approach outperforms the state-of-the-art model while running in nearly half the time. Our work introduces a new, efficient, and interpretable pathway for LLM-generated text detection, showing that classical signal processing techniques offer a surprisingly powerful solution to this modern challenge.
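As a rough illustration of the core feature, the sketch below computes a DFT total energy from a sequence of token log-probabilities. It is a minimal sketch under stated assumptions, not the authors' implementation: the function name `dft_total_energy`, the division by sequence length, and the synthetic input values are all hypothetical choices made here for illustration. In practice the log-probabilities would come from a scoring language model, which is not shown.

```python
import numpy as np


def dft_total_energy(log_probs):
    """Total spectral energy of a mean-centered log-probability sequence.

    Hypothetical helper: the normalization (dividing by the sequence
    length) is an assumption, not taken from the paper.
    """
    x = np.asarray(log_probs, dtype=float)
    x = x - x.mean()              # center the sequence to get a zero-mean signal
    spectrum = np.fft.fft(x)      # global Discrete Fourier Transform
    # Sum of squared magnitudes over all frequency bins. By Parseval's
    # theorem, dividing by len(x) makes this equal to sum(x ** 2).
    return float(np.sum(np.abs(spectrum) ** 2) / len(x))


# Synthetic placeholder sequences, NOT real model outputs: one with
# large-amplitude fluctuations and one with suppressed fluctuations.
rng = np.random.default_rng(0)
large_fluctuations = rng.normal(loc=-3.0, scale=2.0, size=200)
small_fluctuations = rng.normal(loc=-1.0, scale=0.5, size=200)

energy_large = dft_total_energy(large_fluctuations)
energy_small = dft_total_energy(small_fluctuations)
print(energy_large > energy_small)
```

Under this normalization the energy equals the sum of squared deviations from the mean, so a sequence with larger-amplitude fluctuations yields a higher score. That matches the abstract's observation that human-written text exhibits higher spectral energy than LLM-generated text. Any decision threshold on the score would need to be chosen separately.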
