Traffic flow forecasting, STL decomposition, Hybrid model, LSTM, ARIMA, XGBoost, Intelligent transportation systems
Fujiang Yuan, Yangrui Fan, Xiaohuan Bing, Zhen Tian, Chunhong Yuan, Yankang Li
TL;DR
Traffic flow forecasting for intelligent transportation systems is challenged by nonlinear, multi-scale temporal patterns. The authors introduce a decomposition-driven hybrid framework that uses STL to split the series into $Y_v = T_v + S_v + R_v$, with $T_v$ predicted by LSTM, $S_v$ by ARIMA, and $R_v$ by XGBoost, and a final forecast computed as $\hat{F} = \hat{T} \cdot \hat{S} \cdot \hat{R}$. An empirical evaluation on 998 traffic records from a NYC intersection demonstrates that the LSTM-ARIMA-XGBoost hybrid outperforms standalone baselines and variants across MAE, RMSE, and $R^2$, validating the decomposition approach. This work advances ITS forecasting by delivering improved accuracy, interpretability, and robustness, with a practical framework suitable for real-time urban traffic management.
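The three evaluation metrics named above have standard definitions, sketched here in plain Python for reference. This is an illustrative helper, not the authors' evaluation code; the function names are hypothetical, and the sketch assumes equal-length sequences with a non-constant ground truth (otherwise $R^2$ is undefined).

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error: average magnitude of the forecast errors."""
    return sum(abs(a - b) for a, b in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error: penalizes large errors more than MAE."""
    mse = sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true)
    return math.sqrt(mse)


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((a - b) ** 2 for a, b in zip(y_true, y_pred))
    ss_tot = sum((a - mean_true) ** 2 for a in y_true)
    return 1.0 - ss_res / ss_tot
```

Lower MAE and RMSE and an $R^2$ closer to 1 indicate a better fit, which is the sense in which the hybrid is reported to outperform the baselines.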
Abstract
Accurate traffic flow forecasting is essential for intelligent transportation systems and urban traffic management. However, single-model approaches often fail to capture the complex, nonlinear, and multi-scale temporal patterns in traffic flow data. This study proposes a decomposition-driven hybrid framework that integrates Seasonal-Trend decomposition using Loess (STL) with three complementary predictive models. STL first decomposes the original time series into trend, seasonal, and residual components. Then, a Long Short-Term Memory (LSTM) network models the long-term trend, an Autoregressive Integrated Moving Average (ARIMA) model captures seasonal periodicity, and an Extreme Gradient Boosting (XGBoost) algorithm predicts nonlinear residual fluctuations. The final forecast is obtained through multiplicative integration of the sub-model predictions. On 998 traffic flow records collected at a New York City intersection between November and December 2015, the LSTM-ARIMA-XGBoost hybrid model significantly outperforms the standalone LSTM, ARIMA, and XGBoost models on the MAE, RMSE, and $R^2$ metrics. The decomposition strategy effectively isolates temporal characteristics, allowing each model to specialize and thereby improving prediction accuracy, interpretability, and robustness.
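The decompose, forecast, recombine flow described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: a classical moving-average decomposition stands in for STL, and three trivial forecasters (last-value persistence, seasonal-naive repetition, and a mean) stand in for LSTM, ARIMA, and XGBoost. All function names are hypothetical. The sketch also assumes a strictly positive series and a multiplicative decomposition, so that the product of the component forecasts is on the scale of the original series; with additive components the final step would be a sum instead.

```python
from statistics import mean


def moving_average_trend(y, period):
    """Centered moving average (stand-in for the STL trend).

    Edges that lack a full window are padded with the nearest valid value.
    """
    half = period // 2
    core = []
    for i in range(half, len(y) - half):
        window = y[i - half : i + half + 1]
        if period % 2 == 0:
            # Even period: the window has period + 1 points, endpoints get half weight.
            total = sum(window) - 0.5 * (window[0] + window[-1])
        else:
            total = sum(window)
        core.append(total / period)
    return [core[0]] * half + core + [core[-1]] * half


def decompose_multiplicative(y, period):
    """Split y into trend, seasonal, residual with y = trend * seasonal * residual."""
    trend = moving_average_trend(y, period)
    detrended = [v / t for v, t in zip(y, trend)]
    # Average the detrended values at each phase, then normalize to mean 1.
    index = [mean(detrended[p::period]) for p in range(period)]
    scale = mean(index)
    index = [s / scale for s in index]
    seasonal = [index[i % period] for i in range(len(y))]
    residual = [v / (t * s) for v, t, s in zip(y, trend, seasonal)]
    return trend, seasonal, residual


def forecast_trend(trend, horizon):
    """Placeholder for the LSTM: persist the last trend value."""
    return [trend[-1]] * horizon


def forecast_seasonal(seasonal, period, horizon):
    """Placeholder for ARIMA: repeat the last observed seasonal cycle."""
    n = len(seasonal)
    return [seasonal[n - period + (h % period)] for h in range(horizon)]


def forecast_residual(residual, horizon):
    """Placeholder for XGBoost: predict the mean residual factor."""
    return [mean(residual)] * horizon


def hybrid_forecast(y, period, horizon):
    """Decompose, forecast each component separately, recombine by product."""
    trend, seasonal, residual = decompose_multiplicative(y, period)
    t_hat = forecast_trend(trend, horizon)
    s_hat = forecast_seasonal(seasonal, period, horizon)
    r_hat = forecast_residual(residual, horizon)
    return [t * s * r for t, s, r in zip(t_hat, s_hat, r_hat)]


if __name__ == "__main__":
    # Synthetic, perfectly periodic series (not the paper's data).
    pattern = [0.8, 1.0, 1.2, 1.0]
    series = [100 * pattern[i % 4] for i in range(24)]
    print(hybrid_forecast(series, period=4, horizon=4))
```

On the synthetic series in the usage example, the forecast reproduces the next cycle, approximately 80, 100, 120, 100. In a fuller implementation each placeholder would be replaced by the corresponding trained model while the decomposition and recombination steps keep the same shape.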
