Using Graph Neural Networks to Predict Local Culture

Thiago H Silva, Daniel Silver

TL;DR

This paper develops a graph neural network (GNN) framework to predict local cultural dimensions of neighbourhoods by integrating three data streams: area socio-economic information, mobility graphs derived from Yelp review co-location, and group profiles inferred from reviewers’ venue tastes. By modeling cities as graphs where vertices are neighbourhoods and edges encode cross-neighbourhood movements and group interactions, the authors compare eight graph-structured scenarios built on different feature subsets and evaluate their predictive power for 15 cultural dimensions. Key findings show that either area-level socio-economic data or Yelp-derived group profiles achieve the strongest predictive performance, while mobility connectivity alone provides limited value; combining data sources does not consistently improve results. The work demonstrates the potential of GNNs to fuse diverse data sources for urban research and highlights practical implications for cases with scarce census data, while also outlining avenues for richer group profiling and multi-city graph analyses.
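
The count of eight scenarios is consistent with taking every subset of the three data streams (including the empty "None" scenario mentioned in the Figure 4 caption). The sketch below enumerates those subsets under that assumption; the labels `area_info`, `group_profile`, and `mobility` are hypothetical names chosen for illustration, not the scenario names used in the paper's table.

```python
from itertools import combinations

# Hypothetical labels for the three data streams named in the summary.
SOURCES = ("area_info", "group_profile", "mobility")


def enumerate_scenarios(sources=SOURCES):
    """Return every subset of the data sources, from the empty set to all of them."""
    scenarios = []
    for size in range(len(sources) + 1):
        for subset in combinations(sources, size):
            scenarios.append(frozenset(subset))
    return scenarios


scenarios = enumerate_scenarios()
for scenario in scenarios:
    # The empty subset plays the role of a "None"-style baseline.
    print(sorted(scenario) or ["none"])
```

Three binary choices give 2^3 = 8 combinations, which is why the subset reading matches the number of scenarios reported.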

Abstract

Urban research has long recognized that neighbourhoods are dynamic and relational. However, a lack of data, methodologies, and computer processing power has hampered formal quantitative examination of neighbourhood relational dynamics. To make progress on this issue, this study proposes a graph neural network (GNN) approach that permits combining and evaluating multiple sources of information about the internal characteristics of neighbourhoods, their past characteristics, and the flows of groups among them, potentially providing greater expressive power in predictive models. By exploring a large-scale public dataset from Yelp, we show the potential of our approach for considering structural connectedness in predicting neighbourhood attributes, specifically local culture. Results are promising from both a substantive and a methodological point of view. Substantively, we find that either local area information (e.g. area demographics) or group profiles (tastes of Yelp reviewers) give the best results in predicting local culture, and the two are nearly equivalent in all studied cases. Methodologically, exploring group profiles could be a helpful alternative where finding local information for specific areas is challenging, since they can be extracted automatically from many forms of online data. Thus, our approach could empower researchers and policy-makers to use a range of data sources when other local area information is lacking.
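
To make the idea of "flows of groups" among neighbourhoods concrete, here is a minimal sketch of how co-visit edge weights could be derived from review records. It assumes each review has already been reduced to a `(user_id, area_id)` pair and that an edge weight is the number of distinct users seen in both areas; the function name, the input format, and the toy data are all hypothetical and are not taken from the paper.

```python
from collections import defaultdict
from itertools import combinations


def build_mobility_edges(visits):
    """Count, for each pair of areas, the distinct users who visited both.

    visits: iterable of (user_id, area_id) pairs (assumed input format).
    Returns a dict mapping (area_a, area_b), with area_a < area_b, to a count.
    """
    areas_by_user = defaultdict(set)
    for user_id, area_id in visits:
        areas_by_user[user_id].add(area_id)

    edges = defaultdict(int)
    for areas in areas_by_user.values():
        # Each unordered pair of areas visited by the same user adds one to the edge.
        for area_a, area_b in combinations(sorted(areas), 2):
            edges[(area_a, area_b)] += 1
    return dict(edges)


# Toy, made-up records: three users and three areas.
visits = [
    ("u1", "A"), ("u1", "B"),
    ("u2", "A"), ("u2", "B"), ("u2", "C"),
    ("u3", "C"),
]
edges = build_mobility_edges(visits)
print(edges)
```

A user who appears in only one area (here `u3`) contributes no edge, so sparsely reviewed areas can end up isolated in a graph built this way.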

Paper Structure (20 sections, 5 figures, 2 tables)

Figures (5)

  • Figure 1: Overall workflow of the study. Steps are numbered in order of precedence. The procedures in Step 1 cover data collection, whereas those in Step 2 cover information extraction from location-based social network (LBSN) data. Steps 1 and 2 are needed to build the mobility graphs (Step 3), which are in turn used for the cultural dimension predictions (Step 4). [Best in colour]
  • Figure 2: Illustration of our mobility graph. Both panels represent the same city: data for year $y$ on the left and for year $y+1$ on the right. For each year, we consider the same areas (vertices), shown as red circles. Areas can have attributes (i.e. the vertex attributes of the graphs), represented by the triangle, square and star symbols, where the size indicates the importance of the attribute. Crosses inside areas represent urban cultural dimensions. Edges can express the number of people who visited both areas (in all graphs) and the type of people who visited both areas, represented by the stick figure's colour (i.e. the edge attributes of the graph). [Best in colour]
  • Figure 3: Diagram depicting the division of our dataset and the procedures performed. This example assumes that we want to predict local cultural dimensions for the year 2015, which therefore becomes our test set. In this example, we train our model using all previous years: 2011 to 2014. One epoch (one complete cycle) in the training phase comprises learning and validation steps with all consecutive pairs of years, alternating learning and validation in each step, as depicted in the detailed part of the training set in the figure.
  • Figure 4: Prediction errors (RMSE) for different scenarios (graphs) – the higher, the worse. Scenario None, a graph without census information, group profiles, or mobility information, presents the worst results for all cities. Note that we zoomed in on the y-axis to favour legibility. Note on group profiles: for Calgary and Montreal, we found five group profiles considering five topics; for Toronto, we found four group profiles considering seven topics. Recall that the GNN scenario names are listed in Table \ref{tabNamesNets}; for instance, "area info" is a scenario that only includes vertex features $D$.
  • Figure 5: Prediction errors (RMSE) for each FSA in Toronto. Errors are shown for the "Area Info" scenario. [Best in colour]
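
The Figure 3 and Figure 4 captions mention two pieces of bookkeeping that are easy to sketch: forming consecutive pairs of training years and scoring predictions with RMSE. The snippet below is an illustration under stated assumptions rather than the authors' pipeline. The helper names are hypothetical, and how learning and validation alternate across the pairs is deliberately left out because the caption alone does not pin it down.

```python
import math


def consecutive_year_pairs(years):
    """Return (earlier, later) pairs of adjacent years, in chronological order."""
    ordered = sorted(years)
    return list(zip(ordered, ordered[1:]))


def rmse(predicted, observed):
    """Root mean squared error between two equal-length, non-empty sequences."""
    if len(predicted) != len(observed) or len(predicted) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Years from the Figure 3 example: train on 2011 to 2014, hold out 2015 for testing.
train_years = [2011, 2012, 2013, 2014]
test_year = 2015

pairs = consecutive_year_pairs(train_years)
print(pairs)

# Toy, made-up values, only to show the metric being computed.
print(rmse([1.0, 2.0], [1.0, 4.0]))
```

Four training years yield three consecutive pairs, so each epoch in the Figure 3 example would pass over three year-to-year transitions before the 2015 test year is touched.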