Assembling a Multi-Platform Ensemble Social Bot Detector with Applications to US 2020 Elections
Lynnette Hui Xian Ng, Kathleen M. Carley
TL;DR
The paper introduces BotBuster For Everyone, a multi-platform ensemble detector that combines per-field, tree-based classifiers (with Platt scaling) into a threshold-free, aggregated prediction framework capable of handling incomplete data across Twitter, Reddit, and Instagram. It demonstrates improved cross-platform performance over BotHunter and Botometer, processing partial data and enabling analysis of bot activity without requiring complete feature sets. The authors show feature importance centering on username entropy and post engagement, and apply the method to US 2020 election discourse to reveal platform-specific bot prevalence and narrative themes. The approach offers scalable, interpretable bot detection suitable for cross-platform studies, with implications for real-time moderation and sociotechnical research across diverse social media ecosystems.
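As a rough illustration of the threshold-free aggregation over incomplete data described above, the sketch below averages per-field scores over whichever fields are present and picks the class with the larger mean. This is a minimal sketch under stated assumptions, not the authors' implementation: the field names, the `FieldScores` type, and the `aggregate` function are hypothetical. Each per-field classifier is assumed to have already produced calibrated (for example, Platt-scaled) bot and human scores.

```python
from typing import Dict, Optional, Tuple

# Hypothetical container: field name -> (P(bot), P(human)) from that field's
# classifier, or None when the field is missing for this account.
FieldScores = Dict[str, Optional[Tuple[float, float]]]


def aggregate(scores: FieldScores) -> Optional[str]:
    """Average the available per-field scores and return the larger class.

    Assumption: scores are already calibrated probabilities. No cut-off
    value is tuned; the label is whichever class has the higher mean score.
    """
    # Skip fields that were not available for this account.
    available = [s for s in scores.values() if s is not None]
    if not available:
        return None  # nothing to base a decision on

    mean_bot = sum(bot for bot, _ in available) / len(available)
    mean_human = sum(human for _, human in available) / len(available)
    return "bot" if mean_bot > mean_human else "human"


# Usage: the "posts" field is missing, so only two fields contribute.
label = aggregate({
    "username": (0.9, 0.1),
    "posts": None,
    "description": (0.6, 0.4),
})
print(label)
```

Because missing fields are simply skipped, an account with only a username can still be scored, which is the property that allows the same pipeline to run on platforms that expose different fields.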
Abstract
Bots have been in the spotlight of many social media studies because they have been observed participating in the manipulation of information and opinions on social media. These studies have analyzed the activity and influence of bots in a variety of contexts, including elections, protests, and health communication. A prerequisite for such analyses is the identification of bot accounts, which separates them from the other classes of social media users. In this work, we propose an ensemble method for bot detection, designing a multi-platform bot detection architecture that handles several problems along the bot detection pipeline: incomplete data input, minimal feature engineering, optimized classifiers for each data field, and elimination of the need for a threshold value to determine the classification. With these design decisions, we generalize our bot detection framework across Twitter, Reddit and Instagram. We also perform feature importance analysis, observing that the entropy of names and the number of interactions (retweets/shares) are important factors in bot determination. Finally, we apply our multi-platform bot detector to the US 2020 presidential elections to identify and analyze bot activity across multiple social media platforms, showcasing how the online discourse of bots differs between platforms.
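To make the "entropy of names" feature concrete, the following is a minimal sketch of character-level Shannon entropy. It assumes entropy is computed over the characters of the name in bits; the abstract does not specify the exact formulation, and the function name `shannon_entropy` and the example names are hypothetical.

```python
import math
from collections import Counter


def shannon_entropy(name: str) -> float:
    """Character-level Shannon entropy of a string, in bits.

    Assumption: each character is one symbol; this is one common
    formulation and not necessarily the one used in the paper.
    """
    if not name:
        return 0.0
    counts = Counter(name)
    total = len(name)
    # H = -sum(p * log2(p)) over the distinct characters of the name.
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Usage: a name with more distinct, evenly spread characters scores higher.
for candidate in ["john", "x7qk2v9z"]:
    print(candidate, shannon_entropy(candidate))
```

Under this formulation, a name made of many distinct characters, such as a random-looking string, has higher entropy than a short, repetitive one.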
