Interpretable Early Warnings using Machine Learning in an Online Game-experiment

Guillaume Falmagne, Anna B. Stephenson, Simon A. Levin

TL;DR

This work addresses the challenge of forecasting regime shifts in a large-scale online social system by building an interpretable, data-driven early-warning framework. It trains gradient-boosted decision trees on system-specific time series with a $7\text{h}$ memory and interprets the predictions with SHAP to reveal the underlying drivers of transitions in the compositions of Reddit's r/place canvases. The model achieves a ROC AUC of $0.833$ and detects about half of the transitions occurring within the next $20$ minutes at a false-positive rate of $3.7\%$. It generalizes to the 2023 event with reduced but still useful performance (AUC around $0.69$ for a $6$-hour horizon). The SHAP-based analysis uncovers 12 pre-transition behavioral patterns, including aspects of critical slowing down, innovation, and coordination, offering human-readable warnings that could inform monitoring in socio-ecological systems and other dense, high-dimensional domains.

Abstract

Stemming from physics and later applied to other fields such as ecology, the theory of critical transitions suggests that some regime shifts are preceded by statistical early warning signals. Reddit's r/place experiment, a large-scale social game, provides a unique opportunity to test these signals consistently across thousands of subsystems undergoing critical transitions. In r/place, millions of users collaboratively created compositions, or pixel-art drawings, in which transitions occur when one composition rapidly replaces another. We develop a machine-learning-based early warning system that combines the predictive power of multiple system-specific time series via gradient-boosted decision trees with memory-retaining features. Our method significantly outperforms standard early warning indicators. Trained on the 2022 r/place data, our algorithm detects half of the transitions occurring within 20 minutes at a false positive rate of just 3.7%. Its performance remains robust when tested on the 2023 r/place event, demonstrating generalizability across different contexts. Using SHapley Additive exPlanations (SHAP) for interpreting the predictions, we investigate the underlying drivers of warnings, which could be relevant to other complex systems, especially online social systems. We reveal an interplay of patterns preceding transitions, such as critical slowing down or speeding up, a lack of innovation or coordination, turbulent histories, and a lack of image complexity. These findings show the potential of machine learning indicators in socio-ecological systems for predicting regime shifts and understanding their dynamics.
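The detection figures quoted above (a ROC AUC, and a detection rate at a fixed false-positive rate for a given warning range) can be illustrated with a minimal sketch. It assumes, hypothetically, that the warning score is the negated predicted time-to-transition, so that a shorter predicted time means a stronger warning. The function names and the toy numbers are illustrative and are not taken from the paper.

```python
import math

def warning_labels(time_to_transition_s, warning_range_s):
    """Label 1 if a transition occurs within the warning range, else 0.
    math.inf stands for 'no incoming transition' (an assumption of this sketch)."""
    return [1 if t <= warning_range_s else 0 for t in time_to_transition_s]

def roc_auc(labels, scores):
    """ROC AUC as the probability that a random positive outscores a random
    negative, with ties counted as one half (Mann-Whitney formulation)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def tpr_at_fpr(labels, scores, max_fpr):
    """Highest true-positive rate reachable by any score threshold whose
    false-positive rate does not exceed max_fpr."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    best = 0.0
    for thr in sorted(set(scores)):
        tp = sum(1 for y, s in zip(labels, scores) if s >= thr and y == 1)
        fp = sum(1 for y, s in zip(labels, scores) if s >= thr and y == 0)
        if fp / n_neg <= max_fpr:
            best = max(best, tp / n_pos)
    return best

# Toy data (made up): true and predicted time-to-transition in seconds.
true_s = [300, 900, 5000, math.inf, 600, math.inf, 20000, 1000]
pred_s = [400, 800, 6000, 30000, 2000, 1100, 15000, 700]

labels = warning_labels(true_s, warning_range_s=20 * 60)  # 20 min warning range
scores = [-p for p in pred_s]                              # shorter time = stronger warning

print(roc_auc(labels, scores))           # 0.9375 on this toy data
print(tpr_at_fpr(labels, scores, 0.0))   # 0.75 on this toy data
```

Repeating the labelling step for several values of `warning_range_s` gives one AUC per warning range, which is the kind of curve a ROC-AUC-versus-warning-range plot summarizes.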

Paper Structure

This paper contains 4 sections, 6 figures.

Figures (6)

  • Figure 1: Description of the r/place game, its compositions, and the transitions they undergo. (a) The rules of the game for a given user. (b) Snapshots of the full 2022 canvas at multiple points in time; some parts of this canvas were available only later in the game. (c) Fraction of pixels differing from the reference image (diff pixels reference) for the "Chessboard" and "Star Wars: Episode IV -- A New Hope" compositions as they undergo transition. Insets show snapshots of the compositions at different points in time. Time is measured from the beginning of the event.
  • Figure 2: (a-s) The time-dependent variables used in the training of the algorithm, for the "Chessboard" composition. See text for explanations of all variables.
  • Figure 3: Workflow of our machine learning warning system. (a) Transitions at time $t^*$ and the associated target variable time-to-transition ($\Delta^{*}$) are identified based on changes in diff pixels reference. (b) The features for each time instance consist of each of 19 input variables recorded over 9-12 time ranges of a 7h memory, as well as 5 variables without memory. (c) Gradient-boosted decision trees are trained to predict the time-to-transition. (d) Predicted and true values of the time-to-transition are compared in the test sample. (e) The drivers of predictions are analyzed based on SHAP values at a given feature value.
  • Figure 4: Time-to-transition predictions. (a) Predicted time-to-transition $\Delta^{*}_{\textrm{pred}}$ versus the true values $\Delta^{*}_{\textrm{true}}$. The color shows the probability at a given $\Delta^{*}_{\textrm{true}}$ to predict a certain $\Delta^{*}_{\textrm{pred}}$ value. To accommodate display on the log axis, 100 s is added to all time-to-transition values. Perfect predictions would align with the grey line. (b) Probability distribution functions, constructed by kernel density estimation, of $\Delta^{*}_{\textrm{pred}}$ at four different $\Delta^{*}_{\textrm{true}} \pm 5$ min values. The darkest curve includes all instances for which there is no incoming transition. (c) ROC area under the curve (AUC) as a function of the warning range for the machine learning warning signal (dark blue), a single-variable standard warning signal using variance (light blue), and the machine learning warning signal tested on 2023 r/place data (red). The inset shows, as an example, the ROC curve used to compute the ROC AUC for the 20 min warning range.
  • Figure 5: Trajectories of predicted versus true time-to-transition for three compositions. The red line indicates a perfect prediction. The insets show the composition at different points in time. (a) The "Auburn University" composition gives high-accuracy predictions. (b) "1886" gives low-accuracy predictions with oscillations. (c) "Three Cheers For Sweet Revenge" gives low-accuracy predictions with relatively constant values.
  • ...and 1 more figure
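The memory-retaining features described in the Figure 3 caption (each input variable recorded over several time ranges of a 7h memory, plus a few variables without memory) can be sketched as follows. This is a minimal illustration under stated assumptions: the lookback edges, the use of a plain mean as the aggregation, the fixed sampling step, and all names are hypothetical, since the captions do not specify them.

```python
import math

# Hypothetical lookback edges in seconds, coarser further back, ending at 7 h.
LOOKBACK_EDGES_S = [0, 300, 900, 1800, 3600, 7200, 14400, 25200]

def memory_features(series, t_index, step_s, edges_s=LOOKBACK_EDGES_S):
    """Mean of `series` over each lookback range [near, far) seconds before
    the sample at t_index. Ranges that reach before the start of the series
    use the samples available; empty ranges yield NaN."""
    feats = []
    for near, far in zip(edges_s[:-1], edges_s[1:]):
        lo = max(t_index - far // step_s + 1, 0)   # oldest sample in the range
        hi = max(t_index - near // step_s + 1, 0)  # exclusive upper bound
        window = series[lo:hi]
        feats.append(sum(window) / len(window) if window else math.nan)
    return feats

def feature_row(memory_vars, memoryless_vars, t_index, step_s):
    """Concatenate the windowed features of every memory variable with the
    current values of the memoryless variables."""
    row = []
    for values in memory_vars.values():
        row.extend(memory_features(values, t_index, step_s))
    row.extend(float(values[t_index]) for values in memoryless_vars.values())
    return row

# Toy usage: one made-up variable sampled every 300 s, evaluated at the last sample.
toy = list(range(100))
print(memory_features(toy, t_index=99, step_s=300))
# [99.0, 97.5, 95.0, 90.5, 81.5, 63.5, 33.5]
```

Rows built this way for every time instance would form the input matrix of a gradient-boosted regressor whose target is the time-to-transition. Using wider ranges further in the past keeps the feature count small while still covering the full memory.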