DEEP: A Discourse Evolution Engine for Predictions about Social Movements
Valerio La Gatta, Marco Postiglione, Jeremy Gilbert, Daniel W. Linna, Morgan Manella Greenfield, Aaron Shaw, V. S. Subrahmanian
TL;DR
DEEP addresses the challenge of forecasting social-movement discourse by treating it as a multi-output, cross-platform time-series problem in which volume and discrete emotions are predicted jointly. It formalizes the problem as predicting $\mathcal{S}_{t+\Delta}=(\mathbf{V}_{t+\Delta},\mathbf{E}_{t+\Delta},\mathbf{T}_{t+\Delta})$ from the historical trajectory $\mathcal{H}_t$ and journalist-defined key events $\mathcal{K}_{t:t+\Delta}$, using a TimeSeriesTransformer to produce probabilistic forecasts with $p(\mathcal{S}_{t+\Delta})$ parameterized by a Student-t distribution. A large-scale #MeToo dataset is constructed from 433,016 Reddit posts and 121,849 news articles, with a multi-layer data extraction scheme (L0–L3) to capture explicit and semantically related discourse. Results show strong performance, particularly on news sources, with high precision, recall, and F1 for emotion forecasting and solid short-term prediction, alongside meaningful medium-term signals from Reddit. These results indicate practical value for editorial planning and rapid coverage of evolving social movements.
Abstract
Numerous social movements (SMs) around the world help support the UN's Sustainable Development Goals (SDGs). Understanding how key events shape SMs is essential to achieving the SDGs. We have developed SMART (Social Media Analysis & Reasoning Tool) to track social movements related to the SDGs. SMART was designed by a multidisciplinary team of AI researchers, journalists, communications scholars, and legal experts. This paper describes DEEP (Discourse Evolution Engine for Predictions about Social Movements), SMART's transformer-based multivariate time-series engine, which predicts the volume of future articles and posts and the emotions they express. DEEP outputs probabilistic forecasts with uncertainty estimates, providing critical support for editorial planning and strategic decision-making. We evaluate DEEP with a case study of the #MeToo movement, for which we created a novel longitudinal dataset (433K Reddit posts and 121K news articles) covering September 2024 to June 2025. The dataset will be publicly released for research purposes upon publication of this paper.
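To make the idea of a Student-t-parameterized probabilistic forecast concrete, the following is a minimal sketch of the negative log-likelihood that such a forecast head would typically be scored with. It is not DEEP's implementation: the function names, the per-output (loc, scale, df) parameterization, and the assumption that output channels are scored independently are all illustrative choices made here.

```python
import math


def student_t_nll(y, loc, scale, df):
    """Negative log-likelihood of y under a location-scale Student-t.

    loc, scale, and df are the parameters a forecast head would emit
    for one output channel (assumed parameterization, for illustration).
    """
    if scale <= 0 or df <= 0:
        raise ValueError("scale and df must be positive")
    z = (y - loc) / scale
    # Log of the normalizing constant of the Student-t density.
    log_norm = (
        math.lgamma((df + 1) / 2)
        - math.lgamma(df / 2)
        - 0.5 * math.log(df * math.pi)
        - math.log(scale)
    )
    log_density = log_norm - (df + 1) / 2 * math.log1p(z * z / df)
    return -log_density


def joint_nll(targets, params):
    """Sum the per-channel NLL over all outputs (e.g. volume, emotions).

    Assumes channels are conditionally independent given the model
    output, which is a simplification made for this sketch.
    """
    return sum(
        student_t_nll(y, loc, scale, df)
        for y, (loc, scale, df) in zip(targets, params)
    )
```

The heavy tails of the Student-t make the loss less sensitive to sudden spikes in volume than a Gaussian would be, which is a common reason for choosing it in count-like social-media series.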
