A Large-Scale Analysis on the Use of Arrival Time Prediction for Automated Shuttle Services in the Real World

Carolin Schmidt, Mathias Tygesen, Filipe Rodrigues

TL;DR

This study tackles arrival-time (AT) prediction for automated shuttles in real-world pilot deployments by modeling dwell and running times separately and evaluating a range of models, including a novel hierarchical RF-GCN approach that combines a random forest classifier with a graph neural network. Using data from six European SHOW pilots, it shows that dwell-time modeling largely governs overall AT accuracy, especially in low-traffic or speed-regulated contexts, while improvements in running-time prediction are more modest. The work provides a benchmark across diverse operational settings, shows that model usefulness depends on context, and highlights data-collection practices essential for reliable deployment. Together, the findings offer guidance on data preparation, model choice, and feature design to support data-informed decisions in emerging automated public transport systems.

Abstract

Urban mobility is on the cusp of transformation with the emergence of shared, connected, and cooperative automated vehicles. Yet, for them to be accepted by customers, trust in their punctuality is vital. Many pilot initiatives operate without a fixed schedule, which increases the importance of reliable arrival time (AT) predictions. This study presents an AT prediction system for automated shuttles that uses separate models for dwell and running time predictions and is validated on real-world data from six cities. Alongside established methods such as XGBoost, we explore the benefits of leveraging spatial correlations using graph neural networks (GNNs). To accurately handle the case of a shuttle bypassing a stop, we propose a hierarchical model combining a random forest classifier and a GNN. The results for the final AT prediction are promising, showing low errors even when predicting several stops ahead. Yet, no single model emerges as universally superior, and we provide insights into the characteristics of pilot sites that influence the model selection process and prediction performance. Finally, we identify dwell time prediction as the key determinant of overall AT prediction accuracy when automated shuttles are deployed in low-traffic areas or under regulatory speed limits. Our meta-analysis across six pilot sites in different cities provides insights into the current state of autonomous public transport prediction models and paves the way for more data-informed decision-making as the field advances.
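
The way separate dwell and running time predictions combine into an arrival time several stops ahead can be illustrated with a minimal sketch. It is not the authors' implementation: the names `StopForecast`, `hierarchical_dwell`, and `arrival_offsets`, the 0.5 decision threshold, and the rule that a predicted bypass contributes zero dwell time are all assumptions made for illustration, and the classifier and regressor outputs are passed in as plain numbers rather than produced by a random forest or GNN.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class StopForecast:
    """Hypothetical per-stop outputs of the two model stages."""
    p_stop: float         # classifier's probability that the shuttle halts here
    dwell_if_stop: float  # regressor's dwell time in seconds, given a halt


def hierarchical_dwell(forecast: StopForecast, threshold: float = 0.5) -> float:
    """Return 0 s for a predicted bypass, otherwise the regressed dwell time.

    The threshold of 0.5 is an assumption, not a value from the paper.
    """
    if forecast.p_stop >= threshold:
        return forecast.dwell_if_stop
    return 0.0


def arrival_offsets(
    running: Sequence[float],
    stops: Sequence[StopForecast],
    threshold: float = 0.5,
) -> List[float]:
    """Seconds from departure until arrival at each downstream stop.

    running[i] is the predicted running time of the i-th segment ahead and
    stops[i] is the forecast for the stop at the end of that segment.
    Dwell at a stop delays arrivals at later stops, not at the stop itself.
    """
    if len(running) != len(stops):
        raise ValueError("need one stop forecast per running segment")
    offsets: List[float] = []
    elapsed = 0.0
    for run, stop in zip(running, stops):
        elapsed += run                      # travel to the next stop
        offsets.append(elapsed)             # arrival at that stop
        elapsed += hierarchical_dwell(stop, threshold)  # wait before leaving
    return offsets


# Three stops ahead; the second stop is predicted to be bypassed.
example_running = [60.0, 90.0, 45.0]
example_stops = [
    StopForecast(p_stop=0.9, dwell_if_stop=20.0),
    StopForecast(p_stop=0.2, dwell_if_stop=15.0),
    StopForecast(p_stop=0.8, dwell_if_stop=30.0),
]
example_offsets = arrival_offsets(example_running, example_stops)
```

Because every dwell prediction is added to all later arrival times, a dwell error at an early stop carries through to every stop after it, which is one way to read the finding that dwell time prediction dominates overall AT accuracy.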

Paper Structure

This paper contains 15 sections, 4 equations, 11 figures, 7 tables, and 1 algorithm.

Figures (11)

  • Figure 1: Our segment-based approach. The route is divided into dwell and running time segments.
  • Figure 2: Construction of the graph for GNN predictions, from the original route (left) to the final graphs for dwell time (middle) and running time (right) predictions.
  • Figure 3: Mean SHAP values for XGB running time prediction. Variables include: segm_id (segment ID), tod (time-of-day), temp (temperature), wspd (windspeed), dow (day-of-week), prcp (precipitation)
  • Figure 4: Mean SHAP values for dwell time prediction with XGB.
  • Figure 5: Dwell time distribution for Linköping.
  • ...and 6 more figures