Cross-Platform Digital Discourse Analysis of Iran: Topics, Sentiment, Polarization, and Event Validation on Telegram and Reddit

Despoina Antonakaki, Sotiris Ioannidis

Abstract

We analyze Iran-related discourse across two structurally different platforms: Telegram (7,567 messages from international news channels) and Reddit (23,909 posts and comments from Iran-focused and global communities). Using a single reproducible pipeline, we apply NMF topic modeling over TF-IDF features, VADER sentiment scoring, and a keyword-bundle escalation index capturing military, nuclear, and diplomatic narratives. To assess whether discourse dynamics track offline developments, we compare escalation time series with external protest and geopolitical event timelines using same-day and lagged correlation analysis. Same-day correlations are weak, but the strongest relationships occur at non-zero lags, consistent with anticipatory or reactive framing rather than instantaneous mirroring. Finally, using a separate real-time collection (February 2026), we observe synchronized increases in escalation-related narratives that coincide with documented geopolitical developments. Overall, the results show systematic cross-platform differences in narrative structure and tone, and provide quantitative evidence that online escalation signals can align with real-world developments with measurable temporal offsets.
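
As an illustration of the escalation index and lagged correlation analysis described in the abstract, the following is a minimal sketch rather than the authors' implementation. The keyword lists in `BUNDLES`, the function names, and the sign convention for `lag` are all assumptions made for this example; the paper's actual bundles and correlation settings are not given in this section.

```python
import re
from math import sqrt

# Hypothetical keyword bundles (assumption: the paper's real lists are not shown here).
BUNDLES = {
    "military": {"strike", "missile", "attack"},
    "nuclear": {"nuclear", "enrichment", "uranium"},
    "diplomatic": {"talks", "sanctions", "negotiation"},
}

def escalation_score(text):
    """Count the tokens in one message that belong to any keyword bundle."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return sum(1 for t in tokens if any(t in kw for kw in BUNDLES.values()))

def pearson(x, y):
    """Pearson correlation of two equal-length lists; None if undefined."""
    n = len(x)
    if n < 2:
        return None
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return None
    return sxy / sqrt(sxx * syy)

def lagged_correlation(discourse, events, lag):
    """Correlate discourse[t] with events[t + lag] over the overlapping days.

    Assumed convention: lag > 0 means discourse precedes events (anticipatory),
    lag < 0 means discourse follows events (reactive), lag = 0 is same-day.
    """
    n = len(discourse)
    if lag >= 0:
        x, y = discourse[: n - lag], events[lag:]
    else:
        x, y = discourse[-lag:], events[: n + lag]
    return pearson(x, y)

def best_lag(discourse, events, max_lag):
    """Return (lag, r) with the largest absolute correlation in [-max_lag, max_lag]."""
    results = []
    for lag in range(-max_lag, max_lag + 1):
        r = lagged_correlation(discourse, events, lag)
        if r is not None:
            results.append((lag, r))
    return max(results, key=lambda pair: abs(pair[1]))

# Toy daily series: the event counts repeat the discourse signal two days later.
discourse = [0, 1, 0, 3, 0, 2, 0, 0]
events = [0, 0, 0, 1, 0, 3, 0, 2]
lag, r = best_lag(discourse, events, max_lag=3)
```

In this toy series the event counts are the discourse series delayed by two days, so the search selects a non-zero lag, which is the kind of offset the abstract contrasts with weak same-day correlation.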

Paper Structure (40 sections, 27 figures, 2 tables)

Figures (27)

  • Figure 1: Distributional properties of the dataset. Left: cumulative distribution of message lengths across all platforms. Right: cumulative distribution of message volumes per source, illustrating strong concentration of activity.
  • Figure 2: Top Telegram channels by message volume. A small number of international news channels dominate Iran-related content production on Telegram.
  • Figure 3: Message volumes per topic obtained using Non-negative Matrix Factorization (NMF). Topics are labeled using their most representative keywords. Several topics contain non-English (Persian) terms reflecting original-language discourse in the dataset.
  • Figure 4: Topic neighborhood networks computed from pairwise cosine similarity between topic representations. Each node corresponds to one NMF topic (labeled by its most representative keywords). Edges connect each topic to its nearest neighbors under cosine similarity; thicker edges indicate higher similarity. Dense regions indicate overlapping vocabulary and closely related narrative themes, while isolated nodes indicate more distinct topics.
  • Figure 5: Topic neighborhood network computed on the combined Telegram+Reddit corpus. Nodes represent topics and edges denote nearest-neighbor cosine similarity between topic representations (thicker edges indicate higher similarity).
  • ...and 22 more figures
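
The topic neighborhood networks in Figures 4 and 5 link each topic to its nearest neighbors under cosine similarity. The sketch below shows one way such an edge list could be built; the topic labels, the toy vectors, the function names, and the choice of `k` are hypothetical and are not taken from the paper.

```python
from math import sqrt

def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = sqrt(sum(a * a for a in u))
    nv = sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def nearest_neighbor_edges(topics, k=1):
    """Connect each topic to its k most similar other topics.

    topics: dict mapping a topic label to its term-weight vector.
    Returns a list of (source, target, similarity) tuples; the similarity
    can serve as the edge thickness when drawing the network.
    """
    edges = []
    for a, va in topics.items():
        sims = sorted(
            ((cosine(va, vb), b) for b, vb in topics.items() if b != a),
            reverse=True,
        )
        edges.extend((a, b, s) for s, b in sims[:k])
    return edges

# Toy topic vectors over a three-term vocabulary (illustrative values only).
topics = {
    "nuclear": [1.0, 0.0, 0.1],
    "sanctions": [0.9, 0.1, 0.2],
    "protest": [0.0, 1.0, 0.0],
}
edges = nearest_neighbor_edges(topics, k=1)
```

With these toy vectors, "nuclear" and "sanctions" point at each other with high similarity, while "protest" attaches to its nearest neighbor only weakly, mirroring the dense regions and isolated nodes described in the Figure 4 caption.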