Analyzing and Estimating Support for U.S. Presidential Candidates in Twitter Polls
Stephen Scarano, Vijayalakshmi Vasudevan, Chhandak Bagchi, Mattia Samory, JungHwan Yang, Przemyslaw A. Grabowicz
TL;DR
This paper investigates Twitter polls as a data source for gauging public opinion on U.S. presidential candidates during the 2016 and 2020 campaigns. It combines a corpus of nearly two thousand Twitter polls with inferred user attributes (age, gender, ideology, location, bot-likeness), validates these inferences against human judgments, and then uses regression and model-based poststratification to correct for biases. The study finds systematic biases in social polls (candidate ordering, demographic overrepresentation, and bot activity) that inflate Trump support relative to traditional polls, but shows that bias-corrected, poststratified estimates align closely with election outcomes, with errors as low as 1–2%. These results demonstrate the potential of social polls to complement mainstream polls, provided robust bias-correction methods are applied, while also underscoring ethical considerations around platform transparency and data privacy.
Abstract
Polls posted on social media have emerged in recent years as an important tool for estimating public opinion, e.g., to gauge public support for business decisions and political candidates in national elections. Here, we examine nearly two thousand Twitter polls gauging support for U.S. presidential candidates during the 2016 and 2020 election campaigns. First, we describe the rapidly growing prevalence of social polls. Second, we characterize social polls in terms of their heterogeneity and response options. Third, leveraging machine learning models for user attribute inference, we describe the demographics, political leanings, and other characteristics of the users who author and interact with social polls. Fourth, we study the relationship between social poll results, their attributes, and the characteristics of users interacting with them. Our findings reveal that Twitter polls are biased in various ways, from the position of the presidential candidates among the poll options to biases in demographic attributes and poll results. The 2016 and 2020 polls were predominantly crafted by older males and manifested a pronounced bias favoring candidate Donald Trump, in contrast to traditional surveys, which favored Democratic candidates. We further identify and explore potential reasons for such biases in social polling and discuss their possible repercussions. Finally, we show that biases in social media polls can be corrected via regression and poststratification. The errors of the resulting election estimates can be as low as 1%-2%, suggesting that social media polls can become a promising source of information about public opinion.
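To make the poststratification idea concrete, the following is a minimal sketch of the reweighting step only, not the authors' actual pipeline. It assumes that support has already been estimated per demographic cell (for instance by a regression model) and that population shares for the same cells are available. The function name `weighted_support`, the cell labels, and all numbers are hypothetical and chosen purely for illustration; they are not data from the paper.

```python
from typing import Dict


def weighted_support(cell_support: Dict[str, float],
                     cell_share: Dict[str, float]) -> float:
    """Average per-cell support, weighting each cell by its share.

    With sample shares this gives the raw (unadjusted) estimate;
    with population shares it gives the poststratified estimate.
    """
    total_share = sum(cell_share[cell] for cell in cell_support)
    if total_share <= 0:
        raise ValueError("cell shares must sum to a positive number")
    weighted_sum = sum(cell_support[cell] * cell_share[cell]
                       for cell in cell_support)
    return weighted_sum / total_share


# Hypothetical per-cell support for one candidate (illustrative only).
SUPPORT = {
    "older_male": 0.70,
    "older_female": 0.55,
    "younger_male": 0.50,
    "younger_female": 0.35,
}

# Hypothetical shares: the sample overrepresents older males,
# while the target population is assumed to be evenly split.
SAMPLE_SHARE = {
    "older_male": 0.55,
    "older_female": 0.15,
    "younger_male": 0.20,
    "younger_female": 0.10,
}
POPULATION_SHARE = {cell: 0.25 for cell in SUPPORT}

if __name__ == "__main__":
    raw = weighted_support(SUPPORT, SAMPLE_SHARE)
    adjusted = weighted_support(SUPPORT, POPULATION_SHARE)
    print(f"raw estimate:            {raw:.4f}")
    print(f"poststratified estimate: {adjusted:.4f}")
```

In this toy setup, the raw estimate is higher than the poststratified one because the overrepresented cell also has the highest support, which mirrors the kind of demographic skew described above. A fuller treatment would also model the uncertainty of the per-cell estimates, which this sketch ignores.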
