
Classification of Human- and AI-Generated Texts for English, French, German, and Spanish

Kristina Schaaff, Tim Schlippe, Lorenz Mindner

TL;DR

The paper addresses multilingual detection of AI-generated text in two scenarios: distinguishing human-written text from text generated by AI from scratch, and from text rephrased by AI, across English, French, German, and Spanish. It introduces a multilingual corpus, extended to French, German, and Spanish, with 100 human-generated, 100 AI-generated, and 100 AI-rephrased articles per language across 10 topics. It evaluates 37 features in 8 categories using XGBoost, Random Forest, and MLP classifiers, with GPTZero and ZeroGPT as baselines. For AI-generated text, combining all features performs best, and the F1-scores are close across languages (Spanish 99%, English 98%, German 97%, French 95%). Detection of AI-rephrased text is more language-dependent: document features alone perform best for German (72%) and Spanish (86%), and text vector features alone perform best for English (78%). The corpus is publicly available, and the results suggest that the feature set is portable to related languages and point to future work on cross-language robustness and transformer-based enhancements.

Abstract

In this paper, we analyze features to classify human- and AI-generated text for English, French, German, and Spanish and compare them across languages. We investigate two scenarios: (1) the detection of text generated by AI from scratch, and (2) the detection of text rephrased by AI. For training and testing the classifiers in this multilingual setting, we created a new text corpus covering 10 topics for each language. For the detection of AI-generated text, the combination of all proposed features performs best, indicating that our features are portable to other related languages: the F1-scores are close, with 99% for Spanish, 98% for English, 97% for German, and 95% for French. For the detection of AI-rephrased text, the systems with all features outperform systems with other features in many cases, but using only document features performs best for German (72%) and Spanish (86%), and using only text vector features leads to the best results for English (78%).
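
To make the two ingredients mentioned above more concrete, here is a minimal Python sketch of (a) a handful of document-level features and (b) the F1-score used to report results. The function names `document_features` and `f1_score` and the four features chosen are assumptions for illustration only; they are not the paper's 37 features, and the classifiers themselves (XGBoost, Random Forest, MLP) are not reproduced here.

```python
import re


def document_features(text: str) -> dict:
    """Compute a few toy document-level features.

    Hypothetical, illustrative features only; not the paper's feature set.
    """
    # Split on runs of sentence-ending punctuation and drop empty pieces.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    # \w+ is Unicode-aware for str, so accented letters stay inside words.
    words = re.findall(r"\w+", text.lower())
    return {
        "num_sentences": len(sentences),
        "num_words": len(words),
        "words_per_sentence": len(words) / max(len(sentences), 1),
        "type_token_ratio": len(set(words)) / max(len(words), 1),
    }


def f1_score(y_true, y_pred, positive=1) -> float:
    """F1 = harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    print(document_features("One two three. Four five!"))
    # Labels: 1 = AI-generated, 0 = human-written (an assumed encoding).
    print(f1_score([1, 1, 0, 0], [1, 0, 0, 1]))
```

In a pipeline of the kind the abstract describes, feature dictionaries like these would be turned into vectors and passed to a classifier, and the F1-score would be computed on held-out articles per language.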

Paper Structure (32 sections, 1 figure, 5 tables)