GenAI Content Detection Task 3: Cross-Domain Machine-Generated Text Detection Challenge

Liam Dugan, Andrew Zhu, Firoj Alam, Preslav Nakov, Marianna Apidianaki, Chris Callison-Burch

TL;DR

This work investigates cross-domain detection of machine-generated text by evaluating detectors on a large, fixed set of domains and LLMs from the RAID benchmark, all of which are seen during training. It introduces two subtasks, one on cross-domain accuracy and one on adversarial robustness, with a range of baselines and 9 participating teams employing diverse modeling and preprocessing strategies. The strongest submissions detect over 99% of machine-generated text at a 5% false positive rate and stay robust against many attacks, though the Homoglyph, Paraphrase, and Synonym attacks remain challenging. The findings suggest that robust detection is feasible within a constrained domain space, and they highlight important benchmarking considerations, including data leakage risks and confounding factors in real-world deployment.

Abstract

Recently there have been many shared tasks targeting the detection of generated text from Large Language Models (LLMs). However, these shared tasks tend to focus either on cases where text is limited to one particular domain or cases where text can be from many domains, some of which may not be seen during test time. In this shared task, using the newly released RAID benchmark, we aim to answer whether or not models can detect generated text from a large, yet fixed, number of domains and LLMs, all of which are seen during training. Over the course of three months, our task was attempted by 9 teams with 23 detector submissions. We find that multiple participants were able to obtain accuracies of over 99% on machine-generated text from RAID while maintaining a 5% False Positive Rate -- suggesting that detectors are able to robustly detect text from many domains and models simultaneously. We discuss potential interpretations of this result and provide directions for future research.
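The headline metric above, the detection rate on machine-generated text at a fixed 5% False Positive Rate, can be illustrated with a minimal sketch. It assumes each detector outputs a score where higher means "more likely machine-generated" and that a single threshold is calibrated on human-written text. The function names and toy scores are hypothetical and are not taken from the RAID evaluation code, which may calibrate thresholds differently (for example, per domain).

```python
def threshold_at_fpr(human_scores, target_fpr=0.05):
    """Pick a threshold so that at most target_fpr of human texts score strictly above it."""
    ordered = sorted(human_scores, reverse=True)
    # Number of human-written texts we are allowed to flag by mistake.
    allowed = int(target_fpr * len(ordered) + 1e-9)
    if allowed >= len(ordered):
        return float("-inf")
    # Texts scoring strictly above the (allowed + 1)-th highest human score get flagged.
    return ordered[allowed]


def tpr_at_fpr(human_scores, machine_scores, target_fpr=0.05):
    """Fraction of machine-generated texts flagged at the calibrated threshold."""
    thr = threshold_at_fpr(human_scores, target_fpr)
    flagged = sum(1 for s in machine_scores if s > thr)
    return flagged / len(machine_scores)


# Toy data (assumed, for illustration only): 20 human scores, 4 machine scores.
human = [i / 100 for i in range(20)]      # 0.00, 0.01, ..., 0.19
machine = [0.10, 0.50, 0.90, 0.95]

threshold = threshold_at_fpr(human)       # one of 20 human texts may be flagged
rate = tpr_at_fpr(human, machine)         # share of machine texts above the threshold
print(threshold, rate)
```

Fixing the false positive rate first and then reporting the detection rate keeps detectors comparable: a detector cannot inflate its score by flagging more human-written text.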

Paper Structure (37 sections, 10 tables)