LLM-Detector: Improving AI-Generated Chinese Text Detection with Open-Source LLM Instruction Tuning

Rongsheng Wang, Haoming Chen, Ruizhe Zhou, Han Ma, Yaofei Duan, Yanlan Kang, Songhua Yang, Baoyu Fan, Tao Tan

TL;DR

This work tackles the detection of AI-generated Chinese text, where prior detectors generalize poorly and struggle at the sentence level. It introduces LLM-Detector, which uses instruction tuning to align an open-source LLM with the text-detection task, enabling both document-level and sentence-level identification of AI-generated content. A large, multi-domain dataset is built from responses by human experts and nine LLMs to HC3 and M4 seed questions, with careful labeling and a split that supports in-domain and out-of-domain (OOD) evaluation. Results show state-of-the-art performance on in-domain data, strong generalization to OOD data, and robust sentence-level detection. Because the detector is built on open-source LLMs and trained on fine-grained data, it is also practical to customize and deploy.

Abstract

ChatGPT and other general-purpose large language models (LLMs) have achieved remarkable success, but they have also raised concerns about the misuse of AI-generated text. Existing AI-generated text detection models, such as those based on BERT and RoBERTa, are prone to in-domain overfitting, leading to poor out-of-domain (OOD) detection performance. In this paper, we first collect Chinese text responses written by human experts and generated by 9 types of LLMs in answer to questions from multiple domains, and further create a dataset that mixes human-written sentences with sentences polished by LLMs. We then propose LLM-Detector, a novel method for both document-level and sentence-level text detection through instruction tuning of LLMs. Our method leverages the wealth of knowledge LLMs acquire during pre-training, enabling them to detect the text they generate. Instruction tuning aligns the model's responses with the user's expected text detection tasks. Experimental results show that previous methods struggle with sentence-level AI-generated text detection and OOD detection. In contrast, our proposed method not only significantly outperforms baseline methods in both sentence-level and document-level text detection but also demonstrates strong generalization capabilities. Furthermore, since LLM-Detector is trained on open-source LLMs, it is easy to customize for deployment.
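
To make the idea of casting detection as an instruction-tuning task more concrete, the following is a minimal sketch of how labeled texts might be turned into instruction-style training records. It is an illustration under stated assumptions, not the authors' pipeline: the instruction wording, the `Human`/`AI` label strings, the Alpaca-style field names (`instruction`, `input`, `output`), and the helper names `build_record` and `build_sentence_records` are all hypothetical and do not come from the paper.

```python
import json

# Hypothetical instruction text; the prompt actually used by the paper
# is not given in this section.
INSTRUCTION = "Classify the following text as human-written or AI-generated."

# Hypothetical label strings the model is trained to emit.
HUMAN_LABEL = "Human"
AI_LABEL = "AI"


def build_record(text: str, is_ai: bool) -> dict:
    """Cast one labeled text (a document or a sentence) as an
    instruction-tuning example with an assumed Alpaca-style layout."""
    return {
        "instruction": INSTRUCTION,
        "input": text,
        "output": AI_LABEL if is_ai else HUMAN_LABEL,
    }


def build_sentence_records(sentences: list[str], flags: list[bool]) -> list[dict]:
    """Build one record per sentence for sentence-level detection.

    `flags[i]` is True when `sentences[i]` was generated or polished by an LLM.
    """
    if len(sentences) != len(flags):
        raise ValueError("sentences and flags must have the same length")
    return [build_record(s, f) for s, f in zip(sentences, flags)]


if __name__ == "__main__":
    # Document-level example: the whole text carries a single label.
    doc_record = build_record("这是一段由人类专家撰写的回答。", is_ai=False)
    print(json.dumps(doc_record, ensure_ascii=False))

    # Sentence-level example: a mixed document, labeled sentence by sentence.
    mixed = build_sentence_records(
        ["第一句由人类撰写。", "第二句经过模型润色。"],
        [False, True],
    )
    for record in mixed:
        print(json.dumps(record, ensure_ascii=False))
```

Using the same record layout for documents and sentences means a single tuned model can serve both granularities; only the unit of text placed in `input` changes.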

Paper Structure (35 sections, 4 equations, 7 figures, 13 tables)

This paper contains 35 sections, 4 equations, 7 figures, 13 tables.

Figures (7)

  • Figure 1: LLM-Detector Framework. First, HC3 and M4 seed questions are used to prompt responses from human experts and multiple LLMs, where the responses to HC3's multi-domain seed questions will be employed to train the LLM-Detector. Second, the responses generated using M4 seed questions from the same domain as HC3 are utilized to test the in-domain capabilities of the LLM-Detector, while an additionally constructed News dataset is used to test the LLM-Detector's OOD capabilities.
  • Figure 2: Accuracy increases with the proportion of AI-generated content.
  • Figure 3: Dataset Source Parse Analysis. The source of the response data in the training set, in-domain test set, and OOD test set.
  • Figure 4: Part-of-Speech Comparison on Train Set.
  • Figure 5: Part-of-Speech Comparison on Test Set.
  • ...and 2 more figures