Deepfake tweets automatic detection
Adam Frej, Adrian Kaminski, Piotr Marciniak, Szymon Szmajdzinski, Soveatin Kuntur, Anna Wroblewska
TL;DR
The paper addresses the detection of DeepFake (AI-generated) tweets with NLP methods, with the aim of countering misinformation on social media. It evaluates a range of text representations, preprocessing steps, and classical machine learning, deep learning, and transformer models on TweepFake and on GPT-2-generated data, using an 80/10/10 train/validation/test split. The key finding is that RoBERTa on raw TweepFake data achieves the highest balanced accuracy (~0.896) together with a strong F1 score, while GPT-2-generated fakes are harder to detect. The results point to the need for robust detectors that can adapt as language generation models evolve.
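To make the evaluation setup concrete, the following is a minimal sketch of an 80/10/10 split and of the balanced accuracy metric (the mean of per-class recall). It is not the authors' code: the function names `stratified_split` and `balanced_accuracy`, the toy labels, the fixed seed, and the choice to stratify by class are all assumptions made for illustration, since the summary does not state how the split was drawn.

```python
import random
from collections import defaultdict


def stratified_split(labels, fractions=(0.8, 0.1, 0.1), seed=0):
    """Return (train, val, test) index lists that keep class proportions.

    Stratification is an assumption of this sketch, not a detail from the paper.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train, val, test = [], [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_train = round(fractions[0] * len(indices))
        n_val = round(fractions[1] * len(indices))
        train += indices[:n_train]
        val += indices[n_train:n_train + n_val]
        test += indices[n_train + n_val:]  # remainder goes to the test set
    return train, val, test


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall, so each class counts equally."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, pred in zip(y_true, y_pred):
        total[truth] += 1
        if truth == pred:
            correct[truth] += 1
    return sum(correct[c] / total[c] for c in total) / len(total)


# Toy labels standing in for tweet classes (hypothetical data).
labels = ["human"] * 50 + ["bot"] * 50
train, val, test = stratified_split(labels)

# One "bot" tweet misclassified as "human": recall is 1.0 and 0.5.
score = balanced_accuracy(
    ["human", "human", "bot", "bot"],
    ["human", "human", "human", "bot"],
)
```

Balanced accuracy is a natural choice when the classes may be imbalanced, because a detector that always predicts the majority class scores only 0.5 on a two-class problem.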
Abstract
This study addresses the critical challenge of detecting DeepFake tweets by leveraging advanced natural language processing (NLP) techniques to distinguish between genuine and AI-generated texts. Given the increasing prevalence of misinformation, our research utilizes the TweepFake dataset to train and evaluate various machine learning models. The objective is to identify effective strategies for recognizing DeepFake content, thereby enhancing the integrity of digital communications. By developing reliable methods for detecting AI-generated misinformation, this work contributes to a more trustworthy online information environment.
