Data Mining in Transportation Networks with Graph Neural Networks: A Review and Outlook

Jiawei Xue, Ruichen Tan, Jianzhu Ma, Satish V. Ukkusuri

TL;DR

This review surveys the use of Graph Neural Networks (GNNs) in data mining for transportation networks (DMTN), explaining how GNNs capture spatial dependencies among graph-structured transportation entities to improve traffic prediction, traffic operation, and industry-enabled travel-time estimation. It maps the evolution from the classic GCN, GraphSAGE, and GAT architectures to later variants and integrations with large language models, and it highlights interpretability approaches such as GNNExplainer and Traffexplainer. The paper contrasts academic studies with industrial deployments (e.g., Google Maps, Baidu Maps, Amap), discusses current datasets and open-code resources, and outlines future opportunities: probabilistic/interval prediction, robustness under special scenarios, physics-informed GNNs, and standardized benchmarks. By synthesizing recent advances (primarily since 2023), it aims to democratize access to data and methods, accelerating GNN-driven progress in both prediction and operation tasks across diverse transportation modes and geographies.

Abstract

Data mining in transportation networks (DMTN) refers to using diverse types of spatio-temporal data for various transportation tasks, including pattern analysis, traffic prediction, and traffic control. Graph neural networks (GNNs) are essential in many DMTN problems due to their capability to represent spatial correlations between entities. Between 2016 and 2024, the notable applications of GNNs in DMTN have extended to multiple fields such as traffic prediction and operation. However, existing reviews have primarily focused on traffic prediction tasks. To fill this gap, this study provides a timely and insightful summary of GNNs in DMTN, highlighting new progress in prediction and operation from academic and industry perspectives since 2023. First, we present and analyze various DMTN problems, followed by classical and recent GNN models. Second, we delve into key works in three areas: (1) traffic prediction, (2) traffic operation, and (3) industry involvement, such as Google Maps, Amap, and Baidu Maps. Along these directions, we discuss new research opportunities based on the significance of transportation problems and data availability. Finally, we compile resources such as data, code, and other learning materials to foster interdisciplinary communication. This review, driven by recent trends in GNN-based DMTN studies since 2023, could help democratize access to abundant datasets and efficient GNN methods for various transportation problems, including prediction and operation.

Paper Structure

This paper contains 37 sections, 2 equations, 6 figures, 5 tables.

Figures (6)

  • Figure 1: The framework of this survey. The survey begins by presenting data mining problems in transportation and various GNN models (Sections 2 and 3). Subsequently, we delve into current and prospective research on traffic prediction, operations, and industry-driven applications (Sections 4 and 5). Finally, we discuss data and code collections, followed by a conclusion (Sections 6 and 7).
  • Figure 2: Three mechanisms in GNN models: neighborhood aggregation, linear transformation, and nonlinear activation. These mechanisms collectively transform node embeddings, which are numerical vectors associated with the nodes, from layer $l$ to layer $(l+1)$. Such information passing allows node embeddings to capture the graph's topology.
  • Figure 3: Prominent GNN models from 2016 to 2024. For models from the same year, the vertical positions follow no explicit criteria. Readers can locate the research article for each model by its abbreviation.
  • Figure 4: Two approaches to integrating GNNs and LLMs. The first approach employs GNNs to produce network topology-aware node embeddings and feeds them into downstream LLMs (perozzi2024let). In contrast, the second approach leverages knowledge from pre-trained LLMs to enhance downstream GNNs (wei2024llmrec), mitigating information deficiency during GNN training.
  • Figure 5: Applications of GNNs in traffic operation. (a) GNNs can generate route solutions by combining road network topology with well-designed loss functions (min2024unsupervised). (b) GNNs act as reward function generators for route optimization (liu2022personalized).
  • ...and 1 more figures
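
The three mechanisms named in the Figure 2 caption (neighborhood aggregation, linear transformation, and nonlinear activation) can be illustrated with a minimal sketch. It is not taken from the paper: it assumes one common GCN-style form, $h^{(l+1)} = \mathrm{ReLU}(\hat{A} h^{(l)} W)$, where $\hat{A}$ is the row-normalized adjacency matrix with self-loops. The function names, the 3-node graph, and the weight values are hypothetical and chosen only for illustration.

```python
from typing import List

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain-Python matrix product, to keep the sketch dependency-free."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def gnn_layer(adj: Matrix, h: Matrix, w: Matrix) -> Matrix:
    """One message-passing layer: aggregate, transform, activate."""
    n = len(adj)
    # 1) Neighborhood aggregation: mean over each node's neighbors and itself.
    #    Adding the self-loop is an assumption (common in GCN-style layers).
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    a_norm = [[v / sum(row) for v in row] for row in a_hat]
    aggregated = matmul(a_norm, h)
    # 2) Linear transformation with a weight matrix (learned in practice).
    transformed = matmul(aggregated, w)
    # 3) Nonlinear activation (ReLU is used here as one possible choice).
    return [[max(0.0, v) for v in row] for row in transformed]


# Hypothetical 3-node road graph forming a path: 0 - 1 - 2.
adj = [[0.0, 1.0, 0.0],
       [1.0, 0.0, 1.0],
       [0.0, 1.0, 0.0]]
h0 = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # layer-l embeddings, one row per node
w = [[1.0, -1.0], [0.5, 1.0]]              # illustrative weights, not trained
h1 = gnn_layer(adj, h0, w)                 # layer-(l+1) embeddings
```

After one layer, each node's embedding mixes information from its direct neighbors. Stacking layers widens that receptive field, which is how embeddings come to reflect the graph's topology.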