Individual Bus Trip Chain Prediction and Pattern Identification Considering Similarities

Xiannan Huang, Yixin Chen, Quan Yuan, Chao Yang

TL;DR

This work tackles the challenge of predicting individual bus trip chains by moving beyond time-series models to a similarity-driven graph framework, where each day is a node and edge weights encode three key similarity patterns. The authors formalize a similarity function $\text{sim}(i,j)=a_1 x_{i,j,1}+a_2 x_{i,j,2}+a_3\frac{1}{\Delta(i,j)+1}$ and cast prediction as a graph-based semi-supervised classification problem, solved via Label Propagation or Graph Embedding with Random Forest, augmented by a Label Correlation Module to capture pairwise trip co-occurrences. Experiments on a real Shenzhen dataset of 10,000 users show the proposed methods achieving state-of-the-art performance across 1-, 7-, 14-, and 28-day horizons, with ablation studies confirming the essential role of each component. Hyperparameter analysis reveals distinct user types (Repeat Dominated, Repeat-Evolve, Evolve Dominated) and highlights the time-difference feature as particularly influential, offering new insights into daily travel patterns and potential extensions to broader human trajectory modeling.
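The similarity function quoted above can be sketched in a few lines. This is only an illustration under stated assumptions: this summary does not define the pattern features $x_{i,j,1}$ and $x_{i,j,2}$, so they are passed in as already-computed numbers, $\Delta(i,j)$ is taken to be the gap in days between day $i$ and day $j$, and the function name `day_similarity` and the default weights are hypothetical.

```python
def day_similarity(x1, x2, delta_days, weights=(1.0, 1.0, 1.0)):
    """Evaluate sim(i, j) = a1*x1 + a2*x2 + a3 / (delta + 1).

    x1, x2     : precomputed pattern features for the day pair (i, j);
                 their definitions are not given in this summary (assumption).
    delta_days : non-negative time difference between the two days.
    weights    : (a1, a2, a3), the per-user parameters of the function.
    """
    if delta_days < 0:
        raise ValueError("delta_days must be non-negative")
    a1, a2, a3 = weights
    # The +1 keeps the time-difference term finite when delta is 0.
    return a1 * x1 + a2 * x2 + a3 / (delta_days + 1.0)


# One day apart, first pattern present, second absent.
example = day_similarity(1, 0, 1, weights=(0.5, 0.25, 2.0))
```

With these illustrative weights the call evaluates to 0.5 + 0 + 2.0 / 2 = 1.5. A larger $a_3$ relative to $a_1$ and $a_2$ makes recent days count for more, which is one way to read the time-difference feature the summary highlights.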

Abstract

Predicting future bus trip chains for an existing user is of great significance for operators of public transit systems. Existing methods always treat this task as a time-series prediction problem, but a one-dimensional time-series structure cannot express the complex relationships between trips. To better capture the inherent patterns in bus travel behavior, this paper proposes a novel approach that synthesizes future bus trip chains from those of similar days. Key similarity patterns are defined and tested using real-world data, and a similarity function is then developed to capture these patterns. A graph is then constructed in which each day is represented as a node and each edge weight reflects the similarity between two days. The trips on a given day can be regarded as the labels of its node, which transforms the bus trip chain prediction problem into a semi-supervised classification problem on a graph. To address this, we propose several methods and validate them on a real-world dataset of 10,000 bus users, achieving state-of-the-art prediction results. Analyzing the parameters of the similarity function reveals some interesting bus usage patterns, allowing us to cluster bus users into three types: repeat-dominated, evolve-dominated, and repeat-evolve balanced. In summary, our work demonstrates the effectiveness of similarity-based prediction for bus trip chains and provides a new perspective for analyzing individual bus travel patterns. The code for our prediction model is publicly available.
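The graph formulation in the abstract can be illustrated with a generic label propagation sketch: days are nodes, pairwise similarities are edge weights, and each day's trips form a multi-hot label vector. This is a textbook-style propagation with clamping, not necessarily the authors' variant, and it leaves out the Label Correlation Module; the function name `propagate_labels` and the toy data are hypothetical.

```python
def propagate_labels(weights, labels, labeled, n_iter=50):
    """Generic iterative label propagation over a day-similarity graph.

    weights : n x n list of non-negative similarities between days.
    labels  : n x k multi-hot rows (one column per candidate trip);
              rows belonging to unlabeled days are ignored.
    labeled : n booleans, True for days whose trips are already observed.
    Returns an n x k list of scores; observed days keep their labels.
    """
    n, k = len(weights), len(labels[0])
    scores = [
        [float(v) for v in labels[i]] if labeled[i] else [0.0] * k
        for i in range(n)
    ]
    for _ in range(n_iter):
        updated = []
        for i in range(n):
            if labeled[i]:
                # Clamp: observed days never change.
                updated.append(scores[i])
                continue
            total = sum(weights[i][j] for j in range(n) if j != i)
            if total == 0:
                updated.append([0.0] * k)
                continue
            # Similarity-weighted average of the neighbours' scores.
            updated.append([
                sum(weights[i][j] * scores[j][c] for j in range(n) if j != i) / total
                for c in range(k)
            ])
        scores = updated
    return scores


# Toy graph: days 0 and 1 are observed, day 2 is the day to predict.
W = [[0, 1, 3],
     [1, 0, 1],
     [3, 1, 0]]
Y = [[1, 0],   # day 0: trip A only
     [0, 1],   # day 1: trip B only
     [0, 0]]   # day 2: unknown
day2_scores = propagate_labels(W, Y, [True, True, False])[2]
```

In this toy case day 2 is three times more similar to day 0 than to day 1, so trip A receives a score of 0.75 and trip B a score of 0.25. Turning scores into a predicted trip chain, for example by thresholding, is a separate design choice that this sketch does not cover.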

Paper Structure

This paper contains 26 sections, 15 equations, 7 figures, 5 tables.

Figures (7)

  • Figure 1: Different data organization method: from time series to graph.
  • Figure 2: Workflow of this paper
  • Figure 3: Process of validating Pattern 1
  • Figure 4: Similarity scores of all sets.
  • Figure 5: Average Similarity and time difference
  • ...and 2 more figures