Stylometric Detection of AI-Generated Text in Twitter Timelines
Tharindu Kumarage, Joshua Garland, Amrita Bhattacharjee, Kirill Trapeznikov, Scott Ruston, Huan Liu
TL;DR
This work addresses the risk of AI-generated misinformation in Twitter timelines by introducing stylometric signals as an auxiliary cue to detect AI-authored tweets and to locate the point of human-to-AI author change. It proposes a fusion model that combines stylometric features with RoBERTa embeddings for detection and a StyloCPA change-point framework for localization, evaluated on an in-house dataset and TweepFake. Results show that stylometry improves detection performance for short timelines and enables effective change localization under limited training data, often outperforming baselines that rely only on a pre-trained language model (PLM). The approach provides a data-efficient, explainable pathway for detecting AI-generated content and for digital forensics on social platforms.
Abstract
Recent advancements in pre-trained language models have enabled convenient methods for generating human-like text at a large scale. Though these generation capabilities hold great potential for breakthrough applications, they can also be a tool for an adversary to generate misinformation. In particular, social media platforms like Twitter are highly susceptible to AI-generated misinformation. A potential threat scenario is when an adversary hijacks a credible user account and uses a natural language generator to produce misinformation. Such threats necessitate automated detectors for AI-generated tweets in a given user's Twitter timeline. However, tweets are inherently short, which makes it difficult for current state-of-the-art detectors based on pre-trained language models to accurately detect the point at which the AI starts to generate tweets in a given Twitter timeline. In this paper, we present a novel algorithm that uses stylometric signals to aid in detecting AI-generated tweets. We propose models that quantify stylistic changes between human and AI tweets for two related tasks: Task 1, discriminating between human and AI-generated tweets, and Task 2, detecting if and when an AI starts to generate tweets in a given Twitter timeline. Our extensive experiments demonstrate that the stylometric features are effective in augmenting the state-of-the-art AI-generated text detectors.
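
To make the two tasks concrete, the following is a minimal Python sketch of the general idea, not the paper's implementation. The feature set in `stylometric_features` (word count, average word length, punctuation ratio, uppercase ratio) and the least-squares split in `best_single_change_point` are illustrative assumptions; the paper's actual stylometric features, its RoBERTa fusion model, and the StyloCPA framework are not reproduced here.

```python
import string


def stylometric_features(tweet: str) -> dict:
    """Return a few illustrative stylometric signals for one tweet.

    Assumption: these four features are hypothetical stand-ins chosen for
    the sketch, not the feature set used in the paper.
    """
    words = tweet.split()
    n_chars = len(tweet)
    n_words = len(words)
    return {
        "word_count": n_words,
        "avg_word_len": sum(len(w) for w in words) / n_words if n_words else 0.0,
        "punct_ratio": sum(c in string.punctuation for c in tweet) / n_chars if n_chars else 0.0,
        "upper_ratio": sum(c.isupper() for c in tweet) / n_chars if n_chars else 0.0,
    }


def best_single_change_point(series):
    """Return the split index k (1 <= k < n) that minimizes the summed
    squared error around the mean of series[:k] and series[k:].

    This is a plain least-squares mean-shift split used for illustration.
    It always proposes a split, so it covers only the "when" part of
    Task 2, not the "if" part. Returns None for fewer than two points.
    """
    n = len(series)
    if n < 2:
        return None

    def sse(segment):
        mean = sum(segment) / len(segment)
        return sum((x - mean) ** 2 for x in segment)

    return min(range(1, n), key=lambda k: sse(series[:k]) + sse(series[k:]))


def timeline_change_point(tweets, feature="avg_word_len"):
    """Track one stylometric feature across a timeline and locate its shift."""
    series = [stylometric_features(t)[feature] for t in tweets]
    return best_single_change_point(series)
```

In this sketch, calling `timeline_change_point` on a chronologically ordered list of tweets returns the index of the first tweet after the largest shift in the chosen feature. A practical detector would additionally need a decision rule for whether any change occurred at all, and would combine many features rather than tracking one.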
