Topic Shifts as a Proxy for Assessing Politicization in Social Media
Marcelo Sartori Locatelli, Pedro Calais, Matheus Prado Miranda, João Pedro Junho, Tomas Lacerda Muniz, Wagner Meira, Virgilio Almeida
TL;DR
This paper measures politicization in social media by treating topic shifts, from non-political to political discussion, as a proxy for politicization. It introduces a seed-based Positive-Unlabeled (PU) learning framework: a two-step procedure that first uses spies to identify reliable negatives and then trains an XGBoost classifier on word2vec features, achieving $F_1$ scores of about $0.86$ for news posts and $0.80$ for comments. The approach is applied to multi-platform data (Twitter, YouTube, TikTok) from Brazil's 2022 elections and reveals widespread politicization across both hard and soft topics, with notable temporal spikes around political events. The work provides scalable, label-efficient insights into politicization, highlights platform and topic differences, and proposes avenues for relating politicization to polarization and user-level dynamics in future research.
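To make the "spies" idea concrete, here is a minimal sketch of the first step of a spy-based PU procedure. It rests on several assumptions that this summary does not confirm: a nearest-centroid score stands in for a real classifier, the threshold is the lowest spy score, and the names `reliable_negatives`, `spy_frac`, and `score` are hypothetical. It is not the authors' implementation.

```python
import random

def centroid(rows):
    """Mean vector of a non-empty list of equal-length feature vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def score(x, pos_centroid, unl_centroid):
    """Higher means 'closer to the positives'. Assumed stand-in for a classifier."""
    return sq_dist(x, unl_centroid) - sq_dist(x, pos_centroid)

def reliable_negatives(positives, unlabeled, spy_frac=0.2, seed=0):
    """Step 1 of a spy-based PU procedure (illustrative only)."""
    rng = random.Random(seed)
    n_spies = max(1, int(len(positives) * spy_frac))
    spy_idx = set(rng.sample(range(len(positives)), n_spies))

    # Hide a few known positives ("spies") inside the unlabeled set.
    spies = [p for i, p in enumerate(positives) if i in spy_idx]
    kept = [p for i, p in enumerate(positives) if i not in spy_idx]

    # "Train" positives vs. (unlabeled + spies); here just two centroids.
    pos_c = centroid(kept)
    unl_c = centroid(unlabeled + spies)

    # Assumed threshold rule: the lowest score any spy receives.
    threshold = min(score(s, pos_c, unl_c) for s in spies)

    # Unlabeled items scoring below every spy are treated as reliable negatives.
    return [u for u in unlabeled if score(u, pos_c, unl_c) < threshold]
```

In the second step, a classifier (XGBoost on word2vec features in the paper) would be trained on the known positives against the reliable negatives returned here; that step is not shown.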
Abstract
Politicization is a social phenomenon studied in political science, characterized by the extent to which ideas and facts are given a political tone. A range of topics, such as climate change, religion, and vaccines, has been subject to increasing politicization in the media and on social media platforms. In this work, we propose a computational method for assessing politicization in online conversations based on topic shifts, i.e., the degree to which people switch topics in online conversations. The intuition is that topic shifts from a non-political topic to politics are a direct measure of politicization (making something political), and that the more people switch conversations to politics, the more they perceive politics as playing a vital role in their daily lives. A fundamental challenge in studying politicization in social media is that, a priori, any topic may be politicized. Hence, keyword-based methods, and even machine learning approaches that rely on topic labels, are expensive to run and potentially ineffective. Instead, we learn from a seed of political keywords and use Positive-Unlabeled (PU) Learning to detect political comments in reaction to non-political news articles posted on Twitter, YouTube, and TikTok during the 2022 Brazilian presidential elections. Our findings indicate that all platforms show evidence of politicization, as discussions around topics adjacent to politics, such as economy, crime, and drugs, tend to shift to politics. Even for the least politicized topics, the rate at which discussions shift to politics increased in the lead-up to the elections and after other political events in Brazil, which is itself evidence of politicization.
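As an illustration of how a topic-shift rate could be computed once comments have been classified, the sketch below takes the share of political comments under non-political news posts, grouped by the topic of the post. This definition of the rate, the function name `shift_rate_by_topic`, and the input format are assumptions made for illustration; the abstract does not specify the exact metric.

```python
from collections import defaultdict

def shift_rate_by_topic(comments):
    """Share of political comments per news topic.

    `comments` is an iterable of (news_topic, is_political) pairs, one per
    comment posted under a non-political news item (assumed input format).
    """
    totals = defaultdict(int)
    political = defaultdict(int)
    for topic, is_political in comments:
        totals[topic] += 1
        if is_political:
            political[topic] += 1
    # Rate = political comments / all comments, for each topic seen.
    return {topic: political[topic] / totals[topic] for topic in totals}
```

Computing the same rate over successive time windows would expose the temporal increases the abstract describes around elections and other political events.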
