Overview of AuTexTification at IberLEF 2023: Detection and Attribution of Machine-Generated Text in Multiple Domains

Areg Mikael Sarvazyan, José Ángel González, Marc Franco-Salvador, Francisco Rangel, Berta Chulvi, Paolo Rosso

TL;DR

AuTexTification addresses the problem of detecting machine-generated text and attributing it to specific text generation systems across multiple domains and languages. The paper describes a large bilingual dataset (English and Spanish) spanning five domains, built via a prefix continuation data gathering approach, and evaluates two tasks: detection and attribution, using a range of baselines and Transformer-based systems. Key findings show that cross-domain detection is more tractable in English than Spanish, while attribution remains challenging, with ensembles of Transformers offering top performance. The work provides a practical, extensible resource for forensic natural language processing and motivates further exploration of generalization and fine-grained attribution across more languages and domains.

Abstract

This paper presents an overview of the AuTexTification shared task, held as part of the IberLEF 2023 workshop (Iberian Languages Evaluation Forum) within the framework of the SEPLN 2023 conference. AuTexTification consists of two subtasks. In Subtask 1, participants had to determine whether a text was human-authored or generated by a large language model. In Subtask 2, participants had to attribute a machine-generated text to one of six different text generation models. Our AuTexTification 2023 dataset contains more than 160,000 texts across two languages (English and Spanish) and five domains (tweets, reviews, news, legal, and how-to articles). A total of 114 teams signed up to participate; 36 of them submitted 175 runs, and 20 of them submitted working notes. In this overview, we present the AuTexTification dataset and task, the participating systems, and the results.

Paper Structure (19 sections, 6 figures, 7 tables)

Figures (6)

  • Figure 1: Data gathering process.
  • Figure 2: Human performance in English (top) and Spanish (bottom). The grey dotted line is the random baseline.
  • Figure 3: Rank-ordered Macro-F$_1$ with error bars for Subtask 1 in English (top) and Spanish (bottom). Colored lines are baselines.
  • Figure 4: Fine-grained plots for Subtask 1 in English (top) and Spanish (bottom).
  • Figure 5: Rank-ordered Macro-F$_1$ for Subtask 2 in English (top) and Spanish (bottom). Colored lines are baselines.
  • ...and 1 more figure