Data Mining in Transportation Networks with Graph Neural Networks: A Review and Outlook
Jiawei Xue, Ruichen Tan, Jianzhu Ma, Satish V. Ukkusuri
TL;DR
This review surveys the use of graph neural networks (GNNs) for data mining in transportation networks (DMTNs), explaining how GNNs capture spatial dependencies among graph-structured transportation entities to improve traffic prediction, traffic operation, and industry-deployed travel-time estimation. It traces the evolution from the classic GCN, GraphSAGE, and GAT architectures to later variants and integrations with large language models, and it highlights interpretability approaches such as GNNExplainer and Traffexplainer. The paper contrasts academic studies with industrial deployments (e.g., Google Maps, Baidu Maps, and Amap), discusses available datasets and open-source code, and outlines future opportunities, emphasizing probabilistic and interval prediction, robustness under special scenarios, physics-informed GNNs, and the need for standardized benchmarks. By synthesizing recent advances, primarily since 2023, it aims to broaden access to data and methods and to accelerate GNN-based progress on prediction and operation tasks across diverse transportation modes and geographies.
Abstract
Data mining in transportation networks (DMTNs) refers to using diverse types of spatio-temporal data for various transportation tasks, including pattern analysis, traffic prediction, and traffic control. Graph neural networks (GNNs) are essential in many DMTN problems because they can represent spatial correlations between entities. Between 2016 and 2024, notable applications of GNNs in DMTNs expanded into multiple fields, such as traffic prediction and operation. However, existing reviews have focused primarily on traffic prediction tasks. To fill this gap, this study provides a timely summary of GNNs in DMTNs, highlighting progress since 2023 in prediction and operation from both academic and industry perspectives. First, we present and analyze various DMTN problems, followed by classical and recent GNN models. Second, we delve into key works in three areas: (1) traffic prediction, (2) traffic operation, and (3) industry involvement, including Google Maps, Amap, and Baidu Maps. Along these directions, we discuss new research opportunities based on the significance of transportation problems and data availability. Finally, we compile resources such as data, code, and other learning materials to foster interdisciplinary communication. Driven by trends in GNN-based DMTN studies since 2023, this review could democratize access to abundant datasets and efficient GNN methods for various transportation problems, including prediction and operation.
