Spotting AI's Touch: Identifying LLM-Paraphrased Spans in Text

Yafu Li, Zhilin Wang, Leyang Cui, Wei Bi, Shuming Shi, Yue Zhang

TL;DR

This work tackles fine-grained detection of AI paraphrasing by introducing Paraphrased Text Span Detection (PTD), which assigns a paraphrasing score to each sentence of a long text. A dedicated dataset, PASTED, supports training and evaluation of in-distribution (ID) and out-of-distribution (OOD) generalization across context-agnostic and context-aware paraphrases; models reach strong ID AUROC (>0.95) and robust OOD performance. The study finds that sentence-level regression better captures paraphrasing degree and generalizes across prompts and multiple paraphrased spans, and that the context surrounding a paraphrased span is crucial for detection. A two-stage pipeline is shown to defend AI-generation detectors against paraphrasing attacks, and resources are released to support future research. The work advances practical, fine-grained AI-detection capabilities with implications for education, ethics, and content integrity.

Abstract

AI-generated text detection has attracted increasing attention as powerful language models approach human-level generation. However, limited work is devoted to detecting (partially) AI-paraphrased texts, even though AI paraphrasing is commonly employed in various application scenarios for text refinement and diversity. To this end, we propose a novel detection framework, paraphrased text span detection (PTD), which aims to identify paraphrased text spans within a text. Unlike text-level detection, PTD takes in the full text and assigns each sentence a score indicating its paraphrasing degree. We construct a dedicated dataset, PASTED, for paraphrased text span detection. Both in-distribution and out-of-distribution results demonstrate the effectiveness of PTD models in identifying AI-paraphrased text spans. Statistical and model analyses explain the crucial role of the context surrounding the paraphrased text spans. Extensive experiments show that PTD models can generalize to versatile paraphrasing prompts and multiple paraphrased text spans. We release our resources at https://github.com/Linzwcs/PASTED.
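
To make the sentence-level output format concrete, the following is a minimal sketch of one score per sentence. It is not the authors' model: a PTD detector is trained and sees only the final text, whereas this toy compares each sentence against a known original to derive a lexical paraphrasing degree. The function names (lexical_degree, sentence_labels), the use of Python's difflib, the one-to-one sentence alignment, and the example sentences are all assumptions made for illustration.

```python
from difflib import SequenceMatcher


def lexical_degree(original: str, rewritten: str) -> float:
    """Toy paraphrasing degree (hypothetical): 0.0 for identical token
    sequences, approaching 1.0 as fewer tokens are shared in order."""
    a, b = original.lower().split(), rewritten.lower().split()
    return 1.0 - SequenceMatcher(None, a, b).ratio()


def sentence_labels(original_sents, final_sents):
    """Return one degree score per sentence, assuming aligned sentences."""
    if len(original_sents) != len(final_sents):
        raise ValueError("this sketch assumes a one-to-one sentence alignment")
    return [lexical_degree(o, f) for o, f in zip(original_sents, final_sents)]


# Made-up example: only the middle sentence has been paraphrased.
original = [
    "The committee met on Monday.",
    "It approved the new budget after a long debate.",
    "Work will begin next month.",
]
final = [
    "The committee met on Monday.",
    "Following lengthy discussion, the budget proposal was accepted.",
    "Work will begin next month.",
]

scores = sentence_labels(original, final)
for sentence, score in zip(final, scores):
    print(f"{score:.2f}  {sentence}")
```

Untouched sentences score 0.0 and the rewritten sentence scores well above zero, which mirrors the idea of a per-sentence degree rather than a single text-level label.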

Paper Structure

This paper contains 37 sections, 3 equations, 15 figures, 8 tables, 1 algorithm.

Figures (15)

  • Figure 1: A comparison between AI-generated text detection and paraphrased text span detection. The latter identifies paraphrased text spans together with their paraphrasing degree: darker colors denote larger differences between the original and paraphrased text spans.
  • Figure 2: System pipeline of paraphrased text span detection.
  • Figure 3: Perplexity distribution of the complete texts and the exact text spans (original vs. paraphrase).
  • Figure 4: Detection performance (lexical regression) on text from different domains and LLMs.
  • Figure 5: Detection performance (lexical regression) w.r.t. the number of paraphrased sentences in a text.
  • ...and 10 more figures