Context-Aware Prediction of User Engagement on Online Social Platforms

Heinrich Peters, Yozen Liu, Francesco Barbieri, Raiyan Abdul Baten, Sandra C. Matz, Maarten W. Bos

TL;DR

This study investigates context-aware modeling to predict user engagement on online social platforms, addressing the privacy and resource costs of long behavioral histories. It applies stacked LSTM networks to a large Snapchat dataset (N ≈ 79k users, 30 days) with 183 behavioral and 56 context features, including connectivity, location, weather, and socio-demographics. The results show that active-passive engagement is predictable from past behavior (R^2 = 0.345) and that context features substantially boost performance (R^2 = 0.522), even with short histories (R^2 = 0.442) when momentary context is included. SHAP analysis reveals connectivity status and location as key drivers, indicating interactive, non-linear relationships and supporting habit-driven, context-contingent patterns; these findings point to more privacy-preserving, resource-efficient modeling and have practical implications for adaptive system design on social platforms.

Abstract

The success of online social platforms hinges on their ability to predict and understand user behavior at scale. Here, we present data suggesting that context-aware modeling approaches may offer a holistic yet lightweight and potentially privacy-preserving representation of user engagement on online social platforms. Leveraging deep LSTM neural networks to analyze more than 100 million Snapchat sessions from almost 80,000 users, we demonstrate that patterns of active and passive use are predictable from past behavior (R^2 = 0.345) and that the integration of context features substantially improves predictive performance compared to the behavioral baseline model (R^2 = 0.522). Features related to smartphone connectivity status, location, temporal context, and weather were found to capture non-redundant variance in user engagement relative to features derived from histories of in-app behaviors. Further, we show that a large proportion of variance can be accounted for with minimal behavioral histories if momentary context is considered (R^2 = 0.442). These results indicate the potential of context-aware approaches for making models more efficient and privacy-preserving by reducing the need for long data histories. Finally, we employ model explainability techniques to glean preliminary insights into the underlying behavioral mechanisms. Our findings are consistent with the notion of context-contingent, habit-driven patterns of active and passive use, underscoring the value of contextualized representations of user behavior for predicting user engagement on social platforms.
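
The reported gains can be read as differences in the coefficient of determination. The following is a minimal sketch, assuming the conventional definition R^2 = 1 - SS_res / SS_tot (the text above does not state which variant was used). The function names `r_squared` and `r2_gain` and the dictionary keys are hypothetical labels introduced here for illustration; only the three numeric values come from the abstract.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot (assumed definition)."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # residual sum of squares
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)             # total sum of squares
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when the target is constant")
    return 1.0 - ss_res / ss_tot


# R^2 values as reported in the abstract; the keys are illustrative labels.
REPORTED_R2 = {
    "behavioral_history": 0.345,
    "history_plus_context": 0.522,
    "minimal_history_plus_momentary_context": 0.442,
}


def r2_gain(baseline, model):
    """Difference in reported R^2 between two model labels, rounded to 3 decimals."""
    return round(REPORTED_R2[model] - REPORTED_R2[baseline], 3)


if __name__ == "__main__":
    print(r2_gain("behavioral_history", "history_plus_context"))
    print(r2_gain("behavioral_history", "minimal_history_plus_momentary_context"))
```

Read this way, adding context features raises R^2 by 0.177 over the behavioral baseline, and the minimal-history model with momentary context still exceeds the long-history behavioral baseline by 0.097.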

Paper Structure (15 sections, 3 figures)

Figures (3)

  • Figure 1: Overview of the predictive performance of different model specifications. RQ1: Model trained on behavioral histories only. RQ2: Model specifications used to assess the performance increment due to different sets of context features, including socio-demographic context (Model 2), weather (Model 3), temporal context (Model 4), location visits (Model 5), network connectivity status (Model 6), and all context features (Model 7). RQ3: Predictive performance of dense neural network models trained on cross-sectional data. Model 8 was trained on only behavioral data from t-1. Model 9 was trained on behavioral data from t-1 and context features from t0. For more details, please see SI 5.
  • Figure 2: Model performance as a function of sequence length for behavioral histories (blue) and context histories (orange). The curves show diminishing returns to adding additional time steps to the model. Models built on behavioral histories benefit more from long time series compared to models built on context histories.
  • Figure 3: Mean absolute SHAP values (panel A), distribution of individual SHAP values colored by underlying feature values (panel B), and correlations between feature values and SHAP values, as well as feature values and target values (panel C). Connectivity status, online behaviors in the previous session, location visits, and weather features showed the largest impact on predictions.
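
The quantities summarized in Figure 3 can be illustrated with a small sketch. It assumes per-sample SHAP values are already available as a samples-by-features table (for example, from a SHAP explainer) and shows only the aggregation steps: mean absolute SHAP value per feature (panel A) and the correlation between a feature's values and its SHAP values (panel C). The feature names and numbers below are made up for illustration and are not taken from the paper.

```python
from math import sqrt


def mean_abs_shap(shap_values, feature_names):
    """Rank features by mean absolute SHAP value, largest first."""
    n_samples = len(shap_values)
    totals = [0.0] * len(feature_names)
    for row in shap_values:
        for j, value in enumerate(row):
            totals[j] += abs(value)
    ranked = [(name, total / n_samples) for name, total in zip(feature_names, totals)]
    return sorted(ranked, key=lambda item: item[1], reverse=True)


def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical toy data: three samples, two features.
FEATURES = ["wifi_connected", "temperature"]
SHAP_VALUES = [[0.4, -0.1], [-0.2, 0.1], [0.6, 0.0]]
WIFI_FEATURE_VALUES = [1, 0, 1]

if __name__ == "__main__":
    for name, importance in mean_abs_shap(SHAP_VALUES, FEATURES):
        print(f"{name}: {importance:.3f}")
    wifi_shap = [row[0] for row in SHAP_VALUES]
    print(f"corr(feature, SHAP) for wifi_connected: {pearson(WIFI_FEATURE_VALUES, wifi_shap):.3f}")
```

A positive feature-to-SHAP correlation means that higher feature values tend to push predictions upward. Because the abstract describes the relationships as interactive and non-linear, a single correlation coefficient like this is at best a coarse summary.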