AI-generated Text Detection with a GLTR-based Approach

Lucía Yan Wu, Isabel Segura-Bedmar

TL;DR

The paper investigates AI-generated text detection by leveraging GLTR's word-prediction signals within a GPT-2 based classifier, applied to English and Spanish data from the AuTexTification 2023 task. It assesses thresholding on the proportion of green (highly predicted) words per sentence, with $2/3$ emerging as the best balance, and evaluates multiple GPT-2 variants for English and Spanish. Results show macro $F_1$ scores of $80.19\%$ for English (slightly below the top system at $80.91\%$) and up to $66.20\%$ for Spanish (using GPT-2 XL), indicating competitive performance relative to state-of-the-art baselines. The work provides a reproducible pipeline, threshold analysis, and public code/demo, and proposes future directions for fine-tuning Spanish models and extending to multilingual and multimodal detection tasks.
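The thresholding idea can be illustrated with a minimal sketch. It is not the authors' pipeline: it assumes each word's rank under the language model has already been computed, that "green" follows GLTR's usual convention of a word falling within the model's top 10 predictions, and that a text is labelled as generated when its green proportion reaches the threshold. The function names and the direction of the comparison are illustrative assumptions; the label encoding (0: generated, 1: human) follows the confusion-matrix captions.

```python
from fractions import Fraction
from typing import Sequence

# Label encoding taken from the confusion-matrix captions (0: generated, 1: human).
GENERATED = 0
HUMAN = 1

# Assumption: GLTR's green bucket means the word is among the model's top-10 predictions.
GREEN_TOP_K = 10
# Threshold reported as the best balance in the summary above.
GREEN_THRESHOLD = Fraction(2, 3)


def green_fraction(ranks: Sequence[int], top_k: int = GREEN_TOP_K) -> Fraction:
    """Return the share of words whose 0-based model rank is below top_k."""
    if not ranks:
        return Fraction(0)
    green = sum(1 for rank in ranks if rank < top_k)
    return Fraction(green, len(ranks))


def classify(ranks: Sequence[int],
             threshold: Fraction = GREEN_THRESHOLD,
             top_k: int = GREEN_TOP_K) -> int:
    """Hypothetical decision rule: many highly predicted words suggests generated text."""
    if green_fraction(ranks, top_k) >= threshold:
        return GENERATED
    return HUMAN


# Hypothetical per-word ranks for two short texts.
mostly_predictable = [0, 1, 2, 50, 3, 200]   # 4 of 6 words are green
mostly_surprising = [0, 500, 1200, 30]       # 1 of 4 words is green
print(classify(mostly_predictable), classify(mostly_surprising))
```

Exact fractions are used so that a text sitting precisely at $2/3$ is not misclassified through floating-point rounding.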

Abstract

The rise of LLMs (Large Language Models) has contributed to the improved performance and development of cutting-edge NLP applications. However, these models can also pose risks when used maliciously, for example to spread fake news or harmful content, impersonate individuals, or facilitate school plagiarism. This is because LLMs can generate high-quality texts that are difficult to distinguish from those written by humans. GLTR, which stands for Giant Language Model Test Room and was developed jointly by the MIT-IBM Watson AI Lab and HarvardNLP, is a GPT-2-based visual tool designed to help detect machine-generated text; it highlights each word according to the probability that it was machine-generated. One limitation of GLTR is that its results can sometimes be ambiguous and lead to confusion. This study explores various ways to improve GLTR's effectiveness for detecting AI-generated texts within the context of the IberLef-AuTexTification 2023 shared task, in both English and Spanish. Experimental results show that our GLTR-based GPT-2 model, with a macro F1-score of 80.19%, outperforms all state-of-the-art models on the English dataset except the top-ranked one (80.91%). On the Spanish dataset, however, we obtained a macro F1-score of 66.20%, which is 4.57 percentage points below the top-performing model.
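Since the results are reported as macro F1-scores, a short sketch of how that metric is computed may help: the F1-score is calculated separately for each class and the per-class scores are averaged with equal weight. The function names and the toy labels below are illustrative assumptions, not data from the paper; the class encoding (0: generated, 1: human) follows the confusion-matrix captions.

```python
from typing import Sequence


def f1_for_class(y_true: Sequence[int], y_pred: Sequence[int], label: int) -> float:
    """F1 for one class, written as 2*TP / (2*TP + FP + FN)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == label and p != label)
    denominator = 2 * tp + fp + fn
    # Convention assumed here: a class with no true or predicted samples scores 0.
    return 2 * tp / denominator if denominator else 0.0


def macro_f1(y_true: Sequence[int], y_pred: Sequence[int],
             labels: Sequence[int] = (0, 1)) -> float:
    """Unweighted mean of the per-class F1-scores."""
    return sum(f1_for_class(y_true, y_pred, label) for label in labels) / len(labels)


# Toy labels (0: generated, 1: human), purely for illustration.
y_true = [0, 0, 1, 1, 1]
y_pred = [0, 1, 1, 1, 0]
print(f"macro F1 = {macro_f1(y_true, y_pred):.4f}")
```

Because every class counts equally, macro F1 penalizes a detector that performs well on one class but poorly on the other, regardless of class sizes.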

Paper Structure

This paper contains 11 sections, 5 figures, 6 tables.

Figures (5)

  • Figure 1: Number of samples per dataset, class, and domain (subtask 1 for English).
  • Figure 2: Number of samples per dataset, class, and domain (subtask 1 for Spanish).
  • Figure 3: GLTR example (source: arXiv:1906.04043).
  • Figure 4: GPT-2 Small Confusion Matrix (0: generated, 1: human).
  • Figure 5: GPT-2 XL Confusion Matrix (0: generated, 1: human).