Detecting AI-Generated Text: Factors Influencing Detectability with Current Methods
Kathleen C. Fraser, Hillary Dawkins, Svetlana Kiritchenko
TL;DR
The paper surveys the landscape of AI-generated text detection, detailing watermarking, statistical/stylistic analyses, and LM-based classifiers while evaluating their strengths, weaknesses, and applicability across detection scenarios. It highlights the crucial role of dataset domain, language, and model characteristics in detector performance, and emphasizes the fragility of detectors to adversarial attacks and out-of-distribution data. The authors advocate for ensemble approaches, domain-aware training data, and human-in-the-loop strategies, while calling for multilingual, fair, and transparent detection frameworks. The work underscores the societal importance of robust AIGT detection amid rapidly evolving LLM capabilities and regulatory considerations, and outlines practical guidance for researchers and practitioners. It also identifies key gaps, such as cross-lingual generalization, unseen models, and multimodal detection, as fertile ground for future research.
Abstract
Large language models (LLMs) have advanced to the point that even humans have difficulty discerning whether a text was generated by another human or by a computer. However, knowing whether a text was produced by a human or by artificial intelligence (AI) is important for determining its trustworthiness, and has applications in many domains, including detecting fraud and academic dishonesty, as well as combating the spread of misinformation and political propaganda. The task of AI-generated text (AIGT) detection is therefore both very challenging and highly critical. In this survey, we summarize state-of-the-art approaches to AIGT detection, including watermarking, statistical and stylistic analysis, and machine learning classification. We also provide information about existing datasets for this task. Synthesizing the research findings, we aim to provide insight into the salient factors that combine to determine how "detectable" AIGT is under different scenarios, and to make practical recommendations for future work towards this significant technical and societal challenge.
