Multi-Graph Inductive Representation Learning for Large-Scale Urban Rail Demand Prediction under Disruptions
Dang Viet Anh Nguyen, J. Victor Flensburg, Fabrizio Cerreto, Bianca Pascariu, Paola Pellegrini, Carlos Lima Azevedo, Filipe Rodrigues
TL;DR
This work tackles short-term OD demand prediction in large Urban Rail Transit (URT) networks under operational disruptions by modeling each OD pair as a node on multiple graphs and applying inductive representation learning. The proposed method, mGraphSAGE, builds four graphs capturing temporal, spatial, and distance-based correlations ($G_t$, $G_s$, $G_o$, $G_d$) and learns node embeddings via two-layer GraphSAGE in each graph, followed by concatenation and a linear predictor, enabling scalable predictions for all OD pairs in large networks. Key contributions include a detailed OD feature scheme (tendency, periodicity, node-id, and 12 reliability features), four adjacency constructions with explicit thresholds, and extensive empirical evaluation on Copenhagen’s S-train data showing improved robustness under delays and cancellations, especially for larger graphs. The results demonstrate that multi-graph inductive learning can effectively leverage reliability and relational structure to predict OD demand when real-time observability is partial and disruptions are present, offering practical value for planning and operations in growing URT systems.
Abstract
As cities have expanded, Urban Rail Transit (URT) networks have grown significantly. Demand prediction plays an important role in supporting planning, scheduling, fleet management, and other operational decisions. In this study, we propose an Origin-Destination (OD) demand prediction model called Multi-Graph Inductive Representation Learning (mGraphSAGE) for large-scale URT networks under operational uncertainties. Our main contributions are twofold. First, we improve prediction accuracy while ensuring scalability to large networks by relying simultaneously on multiple graphs, where each OD pair is a node and each graph encodes a distinct relationship between OD pairs, such as temporal or spatial correlation. Second, we show the importance of including operational uncertainties, such as train delays and cancellations, as inputs to demand prediction for daily operations. The model is validated on three different scales of the URT network in Copenhagen, Denmark. Experimental results show that, by leveraging information from neighboring ODs and learning node representations via sampling and aggregation, mGraphSAGE is particularly suitable for OD demand prediction in large-scale URT networks and outperforms reference machine learning methods. Furthermore, during periods with train cancellations and delays, the advantage of mGraphSAGE over the other methods grows relative to normal operating conditions, demonstrating its ability to leverage system reliability information for predicting OD demand under uncertainty.
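To make the sampling-and-aggregation idea concrete, the following is a minimal NumPy sketch of a forward pass in the spirit of the architecture summarized above: a two-layer GraphSAGE-style encoder per graph, concatenation of the per-graph embeddings, and a linear predictor. It is an illustration under stated assumptions, not the authors' implementation. The mean aggregator, the toy adjacency lists, the feature and hidden dimensions, the sample size, and the random untrained weights are all hypothetical; only the graph names $G_t$, $G_s$, $G_o$, $G_d$ and the overall layout come from the text.

```python
import numpy as np


def sage_layer(h, neighbors, w_self, w_neigh, rng, num_samples=2):
    """One GraphSAGE-style layer: sample neighbors, mean-aggregate, transform.

    Using separate weights for the node and its aggregated neighborhood is
    equivalent to concatenating both vectors and applying one weight matrix.
    """
    agg = np.zeros_like(h)
    for v, nbrs in enumerate(neighbors):
        if nbrs:  # isolated nodes keep a zero neighborhood vector
            k = min(num_samples, len(nbrs))
            sampled = rng.choice(nbrs, size=k, replace=False)
            agg[v] = h[sampled].mean(axis=0)
    out = np.maximum(h @ w_self + agg @ w_neigh, 0.0)  # ReLU
    norms = np.linalg.norm(out, axis=1, keepdims=True)
    return out / np.maximum(norms, 1e-12)  # row-wise L2 normalization


def multi_graph_forward(x, graphs, params, w_out, rng):
    """Encode every OD node on each graph, concatenate, apply a linear head."""
    embeddings = []
    for neighbors, layers in zip(graphs, params):
        h = x
        for w_self, w_neigh in layers:  # two layers per graph
            h = sage_layer(h, neighbors, w_self, w_neigh, rng)
        embeddings.append(h)
    z = np.concatenate(embeddings, axis=1)
    return z @ w_out  # one predicted demand value per OD node


rng = np.random.default_rng(0)
n_od, n_feat, n_hidden = 5, 3, 4  # hypothetical sizes

# Hypothetical adjacency lists over 5 OD nodes, one list per graph.
graphs = [
    [[1, 2], [0], [0, 3], [2, 4], [3]],   # stands in for G_t
    [[1], [0, 2], [1], [4], [3]],         # stands in for G_s
    [[4], [2, 3], [1], [1], [0]],         # stands in for G_o
    [[2, 3, 4], [], [0], [0], [0]],       # stands in for G_d
]

# Random, untrained weights: two (w_self, w_neigh) pairs per graph.
params = [
    [
        (rng.normal(size=(n_feat, n_hidden)), rng.normal(size=(n_feat, n_hidden))),
        (rng.normal(size=(n_hidden, n_hidden)), rng.normal(size=(n_hidden, n_hidden))),
    ]
    for _ in graphs
]
w_out = rng.normal(size=(len(graphs) * n_hidden, 1))

x = rng.normal(size=(n_od, n_feat))  # placeholder OD node features
pred = multi_graph_forward(x, graphs, params, w_out, rng)
print(pred.shape)
```

Because each node's embedding depends only on its own features and a fixed-size sample of neighbors, the cost per node does not grow with the total number of OD pairs, which is the property that makes this family of models attractive for large networks.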
