STContext: A Multifaceted Dataset for Developing Context-aware Spatio-temporal Crowd Mobility Prediction Models

Liyue Chen, Jiangyi Fang, Tengfei Liu, Fangyuan Gao, Leye Wang

TL;DR

STContext introduces a comprehensive, open-source dataset infrastructure for context-aware spatio-temporal crowd flow prediction (STCFP), unifying nine datasets across five STCFP tasks with ten contextual features, including forecasted weather and POI data. It provides a principled workflow consisting of feature transformation, dependency modeling, representation fusion, and training strategies to incorporate context into deep STCFP models. Through extensive experiments, the paper shows that scenario-specific metrics are essential, that space-invariant context modeling often rivals space-varying approaches, and that combining spatial and temporal dependencies yields the best performance, with forecast quality materially affecting gains. The work offers practical guidance for context feature selection and fusion, discusses limitations of public data sources, and positions STContext as a foundation for generalizable context modeling in urban mobility applications. The dataset and workflow are released at the STContext GitHub repository for community use and extension.

Abstract

In smart cities, context-aware spatio-temporal crowd flow prediction (STCFP) models leverage contextual features (e.g., weather) to identify unusual crowd mobility patterns and enhance prediction accuracy. However, the best practice for incorporating contextual features remains unclear because different papers use them inconsistently. Establishing a principled context modeling paradigm requires a multifaceted dataset that covers rich types of contextual features and STCFP scenarios, yet existing open crowd flow datasets lack an adequate range of contextual features. To fill this gap, we create STContext, a multifaceted dataset for developing context-aware STCFP models. Specifically, STContext provides nine spatio-temporal datasets across five STCFP scenarios and includes ten contextual features, such as weather, air quality index, holidays, points of interest, and road networks. In addition, we propose a unified workflow for incorporating contextual features into deep STCFP methods, with steps including feature transformation, dependency modeling, representation fusion, and training strategies. Through extensive experiments, we obtain several useful guidelines for effective context modeling and insights for future research. STContext is open-sourced at https://github.com/Liyue-Chen/STContext.

Paper Structure (46 sections, 17 figures, 12 tables)

Figures (17)

  • Figure 1: Taxonomy of contextual features.
  • Figure 2: Illustrative example of the proposed multi-source data fusion procedure.
  • Figure 3: Long-term trend and short-term periodicity.
  • Figure 4: Spatial heterogeneity of SO$_2$.
  • Figure 5: Weather distribution over time dimension.
  • ...and 12 more figures