Hybrid Ensemble-Based Travel Mode Prediction
Paweł Golik, Maciej Grzenda, Elżbieta Sienkiewicz
TL;DR
The paper addresses travel mode choice prediction under concept drift in data streams by proposing IEBSM, an Incremental Ensemble of Batch and Stream Models that integrates drift detectors with a heterogeneous mix of batch and online learners and shadow-model retraining. Through experiments on diverse city- and country-scale travel datasets, IEBSM detects drift and adaptively replaces models, outperforming standalone batch or online methods and achieving the best overall rankings. The approach mitigates the difficulty of selecting a single learning paradigm in non-stationary settings and supports continuous monitoring and retraining for dynamic travel-behavior prediction. This has practical implications for robust, drift-aware transportation planning and policy support in evolving urban environments.
Abstract
Travel mode choice (TMC) prediction, which can be formulated as a classification task, helps explain what makes citizens choose different modes of transport for individual trips, and is a major step towards fostering sustainable transportation. Because travel behaviour may evolve over time, TMC models must also detect and address concept drift in the data. In particular, one must decide whether batch or stream mining methods should be used to develop periodically updated TMC models. To address this challenge, we propose the novel Incremental Ensemble of Batch and Stream Models (IEBSM) method, which adapts travel mode choice classifiers to concept drift that may occur in the data. It combines drift detectors with batch learning and stream mining models. We compare it against batch and incremental learners, including methods relying on active drift detection. Experiments with varied travel mode data sets representing both city and country levels show that IEBSM both detects drift in travel mode data and successfully adapts the models to evolving travel mode choice data. The method ranks higher than the batch and stream learners it is compared against.
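
The general idea of pairing drift detectors with both a batch and a stream learner can be illustrated with a minimal sketch. This is not the authors' IEBSM implementation: the nearest-centroid learner, the windowed error-rate detector, the accuracy-weighted vote, the immediate swap-in of a model retrained on recent data, and all names and thresholds (CentroidModel, WindowDriftDetector, HybridEnsemble, make_stream, window sizes, the 0.4 threshold) are assumptions chosen to keep the example self-contained. The data are synthetic, with an abrupt drift in which the relation between trip distance and mode reverses.

```python
import random
from collections import Counter, deque


class CentroidModel:
    """Nearest-centroid classifier, trainable in one batch or one instance at a time."""

    def __init__(self):
        self.sums, self.counts = {}, Counter()

    def learn_one(self, x, y):
        sums = self.sums.setdefault(y, [0.0] * len(x))
        for i, value in enumerate(x):
            sums[i] += value
        self.counts[y] += 1

    def fit(self, rows):
        for x, y in rows:
            self.learn_one(x, y)
        return self

    def predict(self, x):
        if not self.counts:
            return None  # not trained yet

        def sq_dist(label):
            n = self.counts[label]
            return sum((v - s / n) ** 2 for v, s in zip(x, self.sums[label]))

        return min(self.counts, key=sq_dist)


class WindowDriftDetector:
    """Hypothetical detector: signals drift when the windowed error rate is too high."""

    def __init__(self, size=50, threshold=0.4):
        self.errors, self.threshold = deque(maxlen=size), threshold

    def update(self, is_error):
        self.errors.append(int(is_error))
        full = len(self.errors) == self.errors.maxlen
        return full and sum(self.errors) / len(self.errors) > self.threshold

    def accuracy(self):
        return 1.0 - sum(self.errors) / len(self.errors) if self.errors else 1.0


class Member:
    def __init__(self, incremental):
        self.incremental = incremental  # True: stream learner, False: batch learner
        self.model, self.detector = CentroidModel(), WindowDriftDetector()
        self.replacements = 0


class HybridEnsemble:
    def __init__(self, window=100):
        self.recent = deque(maxlen=window)  # most recent labelled trips
        self.members = [Member(incremental=False), Member(incremental=True)]

    def predict(self, x):
        votes = Counter()
        for m in self.members:
            pred = m.model.predict(x)
            if pred is not None:
                votes[pred] += m.detector.accuracy()  # accuracy-weighted vote
        return votes.most_common(1)[0][0] if votes else None

    def learn_one(self, x, y):
        self.recent.append((x, y))
        for m in self.members:
            pred = m.model.predict(x)
            drift = pred is not None and m.detector.update(pred != y)
            first_batch = (pred is None and not m.incremental
                           and len(self.recent) == self.recent.maxlen)
            if drift or first_batch:
                # Replace the member with a model trained on recent data only.
                m.model = CentroidModel().fit(self.recent)
                m.detector = WindowDriftDetector()
                m.replacements += int(drift)
            elif m.incremental:
                m.model.learn_one(x, y)  # the batch member stays frozen


def make_stream(n=1000, drift_at=500, seed=0):
    """Synthetic trips: (distance_km,) -> mode; the mapping flips at drift_at."""
    rng = random.Random(seed)
    for t in range(n):
        mode = rng.choice(["walk", "car"])
        short_trip = (mode == "walk") == (t < drift_at)
        yield (rng.gauss(1.0 if short_trip else 6.0, 0.3),), mode


def run_demo(seed=0):
    ensemble, hits = HybridEnsemble(), []
    for x, y in make_stream(seed=seed):
        hits.append(ensemble.predict(x) == y)  # test first ...
        ensemble.learn_one(x, y)               # ... then train
    pre = sum(hits[200:500]) / 300   # accuracy before the drift
    post = sum(hits[700:]) / 300     # accuracy after recovery
    return pre, post, [m.replacements for m in ensemble.members]


pre_acc, post_acc, replacements = run_demo()
print(pre_acc, post_acc, replacements)
```

In this toy setting, the frozen batch member and the continuously updated stream member are each monitored by their own detector, so either can be replaced independently once its recent errors accumulate. A real study would substitute established learners and drift detectors for these stand-ins.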
