Tracing Thought: Using Chain-of-Thought Reasoning to Identify the LLM Behind AI-Generated Text

Shifali Agrahari, Sanasam Ranbir Singh

TL;DR

The paper tackles the problem of detecting AI-generated text and attributing it to a specific LLM. It proposes COT_Finetuned, a dual-task framework that uses Chain-of-Thought reasoning to generate explanations while jointly predicting AI-vs-Human and LLM identity. Experiments show that CoT enhances performance, with Bert+COT achieving strong Task A F1 and reasonable Task B attribution, underscoring the value of interpretability. The approach has practical implications for academic integrity and content moderation, providing transparent, model-aware detection of AI-generated content.

Abstract

In recent years, the detection of AI-generated text has become a critical area of research due to concerns about academic integrity, misinformation, and ethical AI deployment. This paper presents COT Fine-tuned, a novel framework for detecting AI-generated text and identifying the specific language model responsible for generating the text. We propose a dual-task approach, where Task A involves classifying text as AI-generated or human-written, and Task B identifies the specific LLM behind the text. The key innovation of our method lies in the use of Chain-of-Thought reasoning, which enables the model to generate explanations for its predictions, enhancing transparency and interpretability. Our experiments demonstrate that COT Fine-tuned achieves high accuracy in both tasks, with strong performance in LLM identification and human-AI classification. We also show that the CoT reasoning process contributes significantly to the model's effectiveness and interpretability.

Paper Structure

This paper contains 17 sections, 1 equation, 1 figure, 6 tables.

Figures (1)

  • Figure 1: Proposed detector model for the binary classification Task A and the multi-class classification Task B.