Stylometric Detection of AI-Generated Text in Twitter Timelines

Tharindu Kumarage, Joshua Garland, Amrita Bhattacharjee, Kirill Trapeznikov, Scott Ruston, Huan Liu

TL;DR

This work addresses the risk of AI-generated misinformation in Twitter timelines by introducing stylometric signals as an auxiliary cue to detect AI-authored tweets and to locate the point of human-to-AI author change. It proposes a fusion model that combines stylometric features with RoBERTa embeddings for detection and a StyloCPA change-point framework for localization, evaluated on an in-house dataset and TweepFake. Results show that stylometry improves detection performance for short timelines and enables effective change localization under limited training data, often outperforming PLM-only baselines. The approach provides a data-efficient, explainable pathway for detecting AI-generated content and for digital forensics on social platforms.

Abstract

Recent advancements in pre-trained language models have enabled convenient methods for generating human-like text at a large scale. Though these generation capabilities hold great potential for breakthrough applications, they can also be a tool for an adversary to generate misinformation. In particular, social media platforms like Twitter are highly susceptible to AI-generated misinformation. A potential threat scenario is when an adversary hijacks a credible user account and incorporates a natural language generator to generate misinformation. Such threats necessitate automated detectors for AI-generated tweets in a given user's Twitter timeline. However, tweets are inherently short, which makes it difficult for current state-of-the-art pre-trained language model-based detectors to accurately detect at what point the AI starts to generate tweets in a given Twitter timeline. In this paper, we present a novel algorithm using stylometric signals to aid in detecting AI-generated tweets. We propose models that quantify stylistic changes in human and AI tweets for two related tasks: Task 1 - discriminate between human and AI-generated tweets, and Task 2 - detect if and when an AI starts to generate tweets in a given Twitter timeline. Our extensive experiments demonstrate that the stylometric features are effective in augmenting the state-of-the-art AI-generated text detectors.
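
The "when" part of Task 2 can be pictured as a single change-point search over per-tweet stylometric features. The sketch below is a minimal illustration under stated assumptions, not the authors' StyloCPA implementation: the three features, all function names, and the least-squares split criterion are hypothetical choices made only to show the idea.

```python
import string
from statistics import mean, pstdev


def stylometric_features(tweet):
    """Three toy per-tweet features (hypothetical, chosen only for illustration)."""
    words = tweet.split()
    n_chars = max(len(tweet), 1)
    punct_ratio = sum(ch in string.punctuation for ch in tweet) / n_chars
    upper_ratio = sum(ch.isupper() for ch in tweet) / n_chars
    avg_word_len = mean(len(w) for w in words) if words else 0.0
    return [punct_ratio, upper_ratio, avg_word_len]


def standardize(rows):
    """Z-score each feature column so no single feature dominates the cost."""
    columns = []
    for col in zip(*rows):
        mu, sd = mean(col), pstdev(col)
        sd = sd if sd > 0 else 1.0  # constant column: keep it centered at zero
        columns.append([(x - mu) / sd for x in col])
    return [list(row) for row in zip(*columns)]


def segment_cost(rows):
    """Sum of squared deviations from the per-feature mean inside one segment."""
    cost = 0.0
    for col in zip(*rows):
        mu = mean(col)
        cost += sum((x - mu) ** 2 for x in col)
    return cost


def locate_change_point(timeline, min_size=2):
    """Return the index of the first tweet of the second segment, or None.

    Assumption: exactly one change is searched for. A split is always proposed
    when the timeline is long enough; deciding *if* a change happened at all
    would need an extra threshold or classifier on top of this.
    """
    if len(timeline) < 2 * min_size:
        return None
    feats = standardize([stylometric_features(t) for t in timeline])
    candidates = range(min_size, len(timeline) - min_size + 1)
    return min(
        candidates,
        key=lambda k: segment_cost(feats[:k]) + segment_cost(feats[k:]),
    )
```

Calling `locate_change_point(timeline)` on a list of tweet strings in chronological order returns the position where the stylistic shift is most pronounced under this toy criterion. Because the features are cheap to compute and need no training, a search of this kind remains usable when labeled timelines are scarce, which is the setting the TL;DR highlights.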

Paper Structure (15 sections, 4 figures, 3 tables)

This paper contains 15 sections, 4 figures, 3 tables.

Figures (4)

  • Figure 1: A hypothetical example where a credible news Twitter account gets hijacked and generates misinformation.
  • Figure 2: Proposed stylometry-based architectures.
  • Figure 3: Accuracy in detecting mixed timelines as a function of training set size.
  • Figure 4: Further analysis of different variations in data and features: (a) performance on different topics, (b) performance vs. generator size (number of parameters), and (c) feature importance in each stylometry feature category.