Streaming detection of significant delay changes in public transport systems
Przemysław Wrona, Maciej Grzenda, Marcin Luckner
TL;DR
Public transport delays arise from various disruptions and propagate across networks, which may negatively affect mobility choices. The authors propose Streaming Delay Change Detection (SDCD), a streaming method that applies change detectors such as ADWIN, KSWIN, and HDDM to delay streams on individual edges or edge-hour bins. They implement SDCD within the IoT-oriented USE4IoT architecture and validate it on Warsaw's AVL/GTFS data, including real-time GTFS integration and three OTP instances. The work demonstrates online detection of statistically significant delay changes, identifies a small subgraph responsible for most significant delays, and suggests SDCD-based schedules as a way to improve reliability and planning in multimodal transit.
Abstract
Public transport systems are expected to reduce pollution and contribute to sustainable development. However, disruptions in public transport, such as delays, may negatively affect mobility choices. To quantify delays, aggregated data from vehicle location systems are frequently used. However, delays observed at individual stops are caused, inter alia, by fluctuations in running times and by the propagation of delays occurring at other locations. Hence, in this work, we propose both a method for detecting significant delays and a reference architecture, relying on stream processing engines, in which the method is implemented. The method can complement the calculation of delays defined as deviations from schedules. It provides online rather than batch identification of significant and repetitive delays, as well as resilience to the limited quality of location data. The proposed method can be used with different change detectors, such as ADWIN, applied to a location data stream shuffled to individual edges of a transport graph. It can detect, in an online manner, at which edges statistically significant delays are observed and at which edges delays arise and are reduced. Detections can be used to model mobility choices and, with multimodal trip modelling engines, to quantify the impact of repetitive rather than random disruptions on feasible trips. The evaluation, performed with public transport data from over 2000 vehicles, confirms the merits of the method and reveals that a limited-size subgraph of the transport system graph causes statistically significant delays.
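The idea of routing delay observations to individual edges and running one change detector per edge can be illustrated with a minimal sketch. It is not the authors' implementation: the detector below is a deliberately simple two-window mean comparison standing in for ADWIN or a similar detector, and the names `WindowMeanDetector` and `detect_changes`, the `(edge, delay)` record layout, and the `window` and `threshold` values are all assumptions made for illustration.

```python
from collections import defaultdict, deque
from statistics import mean


class WindowMeanDetector:
    """Toy stand-in for a change detector such as ADWIN (not ADWIN itself).

    It compares the mean delay of two adjacent sliding windows.
    """

    def __init__(self, window=20, threshold=60.0):
        self.window = window        # observations per half-window (assumed value)
        self.threshold = threshold  # minimum mean shift in seconds (assumed value)
        self.values = deque(maxlen=2 * window)

    def update(self, delay):
        """Add one delay; return +1 (delay arose), -1 (delay reduced), or 0."""
        self.values.append(delay)
        if len(self.values) < 2 * self.window:
            return 0
        data = list(self.values)
        shift = mean(data[self.window:]) - mean(data[:self.window])
        if abs(shift) < self.threshold:
            return 0
        # Keep only the recent half so later detections start from the new level.
        self.values = deque(data[self.window:], maxlen=2 * self.window)
        return 1 if shift > 0 else -1


def detect_changes(records, window=20, threshold=60.0):
    """Route (edge, delay) records to per-edge detectors and collect detections.

    Each detection is (record index, edge, direction).
    """
    detectors = defaultdict(lambda: WindowMeanDetector(window, threshold))
    detections = []
    for index, (edge, delay) in enumerate(records):
        direction = detectors[edge].update(delay)
        if direction:
            detections.append((index, edge, direction))
    return detections
```

Here an edge is any hashable key, for example a pair of stop identifiers such as `("A", "B")`. Feeding the function a stream in which the delay on one edge jumps and later falls back, while another edge stays constant, should yield detections only for the first edge, one with direction `+1` and one with `-1`. A real deployment would replace the toy detector with a statistically grounded one and run the routing inside a stream processing engine, as the abstract describes.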
