Conspiracy theories and where to find them on TikTok
Francesco Corso, Francesco Pierri, Gianmarco De Francisci Morales
TL;DR
This study quantitatively characterizes conspiracy theories on TikTok using a longitudinal U.S. dataset of about 1.6 million long videos (1,605,696) from about 1.2 million users (1,178,303), collected via the TikTok Research API. It combines hashtag enrichment with manual validation to identify conspiratorial content, estimates a lower bound on prevalence (up to about 1,000 videos per month in 2023), and analyzes the Creativity Program's impact on overall video duration. It also evaluates open-weight LLMs (Llama3, Mistral, Gemma) for conspiracy detection from audio transcripts, showing high precision in some configurations (e.g., about 0.96 with Step-by-step prompts) but notable trade-offs compared to fine-tuned RoBERTa, underscoring both opportunities and limitations for scalable content moderation. The findings inform moderation strategies and policy design while highlighting the importance of prompt design, data quality, and resource considerations in deploying LLM-based detection systems at scale.
Abstract
TikTok has skyrocketed in popularity over recent years, especially among younger audiences. However, there are public concerns about the potential of this platform to promote and amplify harmful content. This study presents the first systematic analysis of conspiracy theories on TikTok. By leveraging the official TikTok Research API, we collect a longitudinal dataset of 1.5M videos shared in the U.S. over three years. We estimate a lower bound on the prevalence of conspiratorial videos (up to 1,000 new videos per month) and evaluate the effects of TikTok's Creativity Program for monetization, observing an overall increase in video duration regardless of content. Lastly, we evaluate the capabilities of state-of-the-art open-weight Large Language Models to identify conspiracy theories from audio transcriptions of videos. While these models achieve high precision in detecting harmful content (up to 96%), their overall performance remains comparable to fine-tuned traditional models such as RoBERTa. Our findings suggest that Large Language Models can serve as an effective tool for supporting content moderation strategies aimed at reducing the spread of harmful content on TikTok.
