M-DAIGT: A Shared Task on Multi-Domain Detection of AI-Generated Text
Salima Lamsiyah, Saad Ezzini, Abdelkader El Mahdaouy, Hamza Alami, Abdessamad Benlahbib, Samir El Amrany, Salmane Chafik, Hicham Hammouchi
TL;DR
The paper introduces the Multi-Domain Detection of AI-Generated Text (M-DAIGT) shared task, which targets AI-generated text in two high-stakes domains: news articles and academic writing. It releases a 30,000-sample benchmark of human-written and AI-generated content, with the AI text produced by multiple LLMs under varied prompting strategies, enabling cross-domain evaluation. The baselines and the four participating systems, mostly transformer models augmented with stylometric features, achieve near-perfect to perfect F1 scores on both subtasks, underscoring the strength of current state-of-the-art detectors on this benchmark. The work highlights the practical value of robust AI-content detection while acknowledging limitations, namely a static dataset, a binary framing of authorship, and limited language coverage, and it points to future work on adversarial testing, mixed authorship, and multilingual expansion.
Abstract
The generation of highly fluent text by Large Language Models (LLMs) poses a significant challenge to information integrity and academic research. In this paper, we introduce the Multi-Domain Detection of AI-Generated Text (M-DAIGT) shared task, which focuses on detecting AI-generated text across multiple domains, particularly in news articles and academic writing. M-DAIGT comprises two binary classification subtasks: News Article Detection (NAD) (Subtask 1) and Academic Writing Detection (AWD) (Subtask 2). To support this task, we developed and released a new large-scale benchmark dataset of 30,000 samples, balanced between human-written and AI-generated texts. The AI-generated content was produced using a variety of modern LLMs (e.g., GPT-4, Claude) and diverse prompting strategies. A total of 46 unique teams registered for the shared task, of which four teams submitted final results. All four teams participated in both Subtask 1 and Subtask 2. We describe the methods employed by these participating teams and briefly discuss future directions for M-DAIGT.
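Both subtasks are binary classification problems, and the summary above reports results as F1 scores. As a rough illustration of how such a score is computed, the sketch below implements binary F1 in plain Python. The label convention (1 for AI-generated, 0 for human-written) and the function name `binary_f1` are assumptions made for this example; the source does not describe the official scoring script or its averaging mode.

```python
def binary_f1(y_true, y_pred, positive=1):
    """F1 score for one positive class (assumed here: 1 = AI-generated)."""
    pairs = list(zip(y_true, y_pred))
    # Count true positives, false positives, and false negatives.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No correct positive predictions: define F1 as 0 to avoid dividing by zero.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    # F1 is the harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)


# Toy labels for illustration only; these are not M-DAIGT data.
gold = [1, 1, 0, 0]
predicted = [1, 0, 0, 0]
score = binary_f1(gold, predicted)
```

In this toy case one of the two AI-generated samples is missed, so precision is 1.0 and recall is 0.5. A detector that labels every sample correctly would score exactly 1.0, which is the sense of "perfect F1" used in the summary.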
