Reading Between the Tweets: Deciphering Ideological Stances of Interconnected Mixed-Ideology Communities

Zihao He; Ashwin Rao; Siyi Guo; Negar Mokhberian; Kristina Lerman

TL;DR

This work tackles the challenge of uncovering nuanced ideological stances in interconnected online communities beyond a liberal/conservative dichotomy. It introduces a graph-informed framework that finetunes one GPT-2 model per community on that community's tweets and uses message passing across the community retweet network to incorporate neighboring communities' viewpoints; the resulting stances are evaluated against the ANES 2020 survey as ground truth. The approach outperforms baselines on target-specific stance ranking and remains robust in ablations, highlighting the value of inter-community information flow for ideology probing. The findings suggest that language models, when guided by network structure, can reveal complex, mixed-ideology dynamics in digital discourse and offer a scalable tool for studying political attitudes online. The authors acknowledge platform- and time-related limitations and ethical considerations.

Abstract

Recent advances in NLP have improved our ability to understand the nuanced worldviews of online communities. Existing research focused on probing ideological stances treats liberals and conservatives as separate groups. However, this fails to account for the nuanced views of the organically formed online communities and the connections between them. In this paper, we study discussions of the 2020 U.S. election on Twitter to identify complex interacting communities. Capitalizing on this interconnectedness, we introduce a novel approach that harnesses message passing when finetuning language models (LMs) to probe the nuanced ideologies of these communities. By comparing the responses generated by LMs and real-world survey results, our method shows higher alignment than existing baselines, highlighting the potential of using LMs in revealing complex ideologies within and across interconnected mixed-ideology communities.
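The comparison between LM responses and survey results is reported in this paper as Spearman's rank correlation (see the Figure 5 caption). The following is a minimal sketch of that metric for one target, not the authors' evaluation code. The community names and all scores are hypothetical, and the helper names `average_ranks` and `spearman` are introduced here for illustration only.

```python
def average_ranks(values):
    """Return 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rho: the Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical stance scores toward one target for three communities.
communities = ["C1", "C2", "C3"]
survey_scores = [0.8, 0.5, 0.2]  # assumed survey-derived favorability
model_scores = [0.6, 0.7, 0.1]   # assumed LM-derived favorability

rho = spearman(survey_scores, model_scores)
print(f"Spearman's rho across {len(communities)} communities: {rho:.2f}")
```

A coefficient of 1 would mean the model orders the communities exactly as the survey does; lower values indicate weaker agreement in ordering.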

Paper Structure (26 sections, 1 equation, 6 figures, 3 tables)

Introduction
Related Work
Data
ANES Survey
2020 U.S. Election Twitter Data
Exploring Ad-hoc Online Communities
Communities in Co-sharing Network
Mixed Ideologies of Online Communities
Interactions between Online Communities
Probing Stances of Online Communities
Methodology
Finetuning Language Model.
Message Passing between Community Corpora.
Evaluation Protocol
Baselines
...and 11 more sections

Figures (6)

Figure 1: Illustration of online communities, where colors of users represent their political ideologies. (a) Idealized online communities that are disconnected and have unified political ideologies. (b) Real-world online communities that are interconnected and have mixed political ideologies covering the full political spectrum. Links between them signify the flow of information and interaction, such as retweeting.
Figure 2: News co-sharing network. A link exists between a user and a news outlet if the user has shared links to articles from the outlet in their tweets. Users with similar news feeds are likely to belong to the same online community.
Figure 3: Illustration of message passing for community $C_1$ in a simplified retweet network with three communities. The source node of an edge is the retweeting community, and the target node is the retweeted community. $D_1$ (the corpus of $C_1$) contains 100 tweets, of which the fractions of liberal and conservative tweets are 0.7 and 0.3, respectively. The normalized out-degrees of community $C_1$ are shown on its out-edges. At each step of message passing, community $C_1$ exchanges information with its neighboring communities, including itself, and updates its corpus based on its retweeting activities. The numbers of liberal and conservative tweets sampled from the neighbors are based on the existing ratio within $C_1$.
Figure 4: Illustration of target-specific community ranking and community-specific target ranking using a toy example with three communities and three targets.
Figure 5: Spearman's rank correlation coefficients using Prompt 4 ("X is/are the") for 10 targets and 5 communities of the finetuned GPT-2 baseline and our method on two ranking tasks. The targets/communities are the ones with the largest coefficient change between the two methods, either positively or negatively. From left to right, the targets/communities are sorted by the magnitude of their performance changes.
...and 1 more figure
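The message-passing step described in the Figure 3 caption can be sketched as follows. This is one possible reading of the caption under stated assumptions, not the authors' implementation: the out-degree weights (0.5, 0.3, 0.2), the neighbor corpus sizes, the placeholder tweet strings, and the function name `message_passing_step` are all hypothetical. Only the 100-tweet corpus with a 0.7/0.3 liberal/conservative split comes from the caption.

```python
import random


def message_passing_step(corpora, weights, target, rng):
    """Rebuild the target community's corpus from its neighbors.

    corpora: {community: {"liberal": [...], "conservative": [...]}}
    weights: {neighbor: normalized out-degree of target toward neighbor};
             includes the target itself and is assumed to sum to 1.
    For each neighbor, the number of tweets drawn per ideology label is the
    edge weight times the target's own count for that label, so the target's
    liberal/conservative ratio is preserved.
    """
    own_counts = {label: len(tweets) for label, tweets in corpora[target].items()}
    updated = {label: [] for label in own_counts}
    for neighbor, weight in weights.items():
        for label, n_own in own_counts.items():
            pool = corpora[neighbor][label]
            k = min(round(weight * n_own), len(pool))  # cannot exceed the pool
            updated[label].extend(rng.sample(pool, k))
    return updated


def make_corpus(name, n_liberal, n_conservative):
    """Build placeholder tweets; real corpora would hold tweet text."""
    return {
        "liberal": [f"{name}-liberal-{i}" for i in range(n_liberal)],
        "conservative": [f"{name}-conservative-{i}" for i in range(n_conservative)],
    }


corpora = {
    "C1": make_corpus("C1", 70, 30),  # 100 tweets, 0.7 / 0.3 as in the caption
    "C2": make_corpus("C2", 40, 60),  # hypothetical neighbor
    "C3": make_corpus("C3", 20, 80),  # hypothetical neighbor
}
# Hypothetical normalized out-degrees of C1, including its self-loop.
weights_c1 = {"C1": 0.5, "C2": 0.3, "C3": 0.2}

updated = message_passing_step(corpora, weights_c1, "C1", random.Random(0))
print(len(updated["liberal"]), len(updated["conservative"]))
```

Under these assumed weights, the updated corpus keeps its original size and 70/30 split while replacing half of its tweets with ones drawn from the two neighbors.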
