A Survey of AI-generated Text Forensic Systems: Detection, Attribution, and Characterization

Tharindu Kumarage, Garima Agrawal, Paras Sheth, Raha Moraffah, Aman Chadha, Joshua Garland, Huan Liu

TL;DR

This survey reviews existing work on AI-generated text forensics through a detailed taxonomy built on three primary pillars (detection, attribution, and characterization) that together enable a practical understanding of AI-generated text.

Abstract

Recent years have seen a rapid proliferation of advanced Large Language Models (LLMs) capable of generating high-quality text. While these LLMs have revolutionized text generation across various domains, they also pose significant risks to the information ecosystem, such as the potential for generating convincing propaganda, misinformation, and disinformation at scale. This paper reviews AI-generated text forensic systems, an emerging field that addresses the challenges of LLM misuse. We present an overview of existing efforts in AI-generated text forensics by introducing a detailed taxonomy built on three primary pillars: detection, attribution, and characterization. These pillars enable a practical understanding of AI-generated text, from identifying AI-generated content (detection) and determining the specific AI model involved (attribution) to grouping the underlying intents of the text (characterization). Furthermore, we explore available resources for AI-generated text forensics research and discuss the evolving challenges and future directions of forensic systems in an AI era.

Paper Structure

This paper contains 30 sections, 2 figures, and 4 tables.

Figures (2)

  • Figure 1: Primary pillars of AI-generated text forensics: (i) detection, (ii) attribution, and (iii) characterization, where each pillar provides an increasingly nuanced understanding of AI-generated text.
  • Figure 2: Taxonomy of AI-generated text forensic systems.