Enhancing Authorship Attribution through Embedding Fusion: A Novel Approach with Masked and Encoder-Decoder Language Models

Arjun Ramesh Kaushik, Sunil Rufus R P, Nalini Ratha

TL;DR

This work proposes a framework that uses textual embeddings from Pre-trained Language Models (PLMs) to distinguish AI-generated from human-authored text. Embedding Fusion integrates semantic information from multiple Language Models, exploiting their complementary strengths to improve performance.

Abstract

The increasing prevalence of AI-generated content alongside human-written text underscores the need for reliable discrimination methods. To address this challenge, we propose a novel framework that uses textual embeddings from Pre-trained Language Models (PLMs) to distinguish AI-generated from human-authored text. Our approach uses Embedding Fusion to integrate semantic information from multiple Language Models, harnessing their complementary strengths to enhance performance. In an extensive evaluation across diverse, publicly available datasets, the proposed approach achieves classification accuracy greater than 96% and a Matthews Correlation Coefficient (MCC) greater than 0.93. The evaluation is conducted on a balanced dataset of texts generated by five well-known Large Language Models (LLMs), highlighting the effectiveness and robustness of the methodology.
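
For readers unfamiliar with the MCC figure quoted above, the following is a minimal sketch of how the coefficient is computed in the binary case (AI-generated vs. human-authored). The function name and the confusion-matrix counts are hypothetical and chosen only for illustration; they are not results from the paper, and the paper's own evaluation may use a multi-class variant.

```python
import math


def binary_mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews Correlation Coefficient for a binary confusion matrix.

    Returns 0.0 when any marginal total is zero, a common convention
    for the otherwise undefined case.
    """
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return numerator / denominator if denominator else 0.0


def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Fraction of all samples classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


# Hypothetical balanced test set: 100 AI-generated and 100 human-authored
# texts, with 3 errors in each class (illustrative numbers only).
counts = dict(tp=97, tn=97, fp=3, fn=3)
print(f"accuracy = {accuracy(**counts):.2f}")
print(f"MCC      = {binary_mcc(**counts):.2f}")
```

On a balanced set with symmetric errors, MCC equals twice the accuracy minus one, which is why it is the stricter of the two numbers: a classifier that guesses at random scores about 50% accuracy but an MCC near 0.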

Paper Structure

This paper contains 9 sections, 1 equation, 5 figures, 4 tables.

Figures (5)

  • Figure 1: An overview of our framework to perform authorship attribution between machine-generated and human-authored texts.
  • Figure 2: Architecture of the classification model used in our framework.
  • Figure 3: Autoregressive Language Models use the provided context to generate relevant responses. Here, we assume that the token 'C' captures the context of the query.
  • Figure 4: Masked Language Models are trained to predict masked tokens (represented as [MASK]) in a query, which helps the models better understand semantic context.
  • Figure 5: Encoder-Decoder Language Models transform an input sequence into a fixed-sized embedding. These embeddings are used by the decoder to generate an output sequence. Such models are often used in machine translation and sequence-to-sequence prediction.
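
The abstract describes Embedding Fusion as integrating embeddings from multiple language models, such as the masked and encoder-decoder models of Figures 4 and 5. This excerpt does not state the fusion operator, so the sketch below assumes the simplest option: L2-normalizing each model's embedding and concatenating the results. The function names and the toy vectors are hypothetical; real embeddings would come from the PLMs and have hundreds of dimensions.

```python
import math
from typing import List, Sequence


def l2_normalize(vec: Sequence[float]) -> List[float]:
    """Scale a vector to unit length; an all-zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else list(vec)


def fuse_embeddings(embeddings: Sequence[Sequence[float]]) -> List[float]:
    """Assumed fusion operator: concatenate the normalized per-model embeddings.

    Normalizing first keeps one model's larger activation scale from
    dominating the fused vector.
    """
    fused: List[float] = []
    for emb in embeddings:
        fused.extend(l2_normalize(emb))
    return fused


# Toy stand-ins for one text's embeddings from two different models.
masked_lm_embedding = [3.0, 4.0]
encoder_decoder_embedding = [0.0, 2.0, 0.0]

fused = fuse_embeddings([masked_lm_embedding, encoder_decoder_embedding])
print(len(fused), fused)
```

The fused vector has one dimension per input dimension across all models, and it would be the input to a downstream classifier such as the one shown in Figure 2.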