Beyond the Hype: Comparing Lightweight and Deep Learning Models for Air Quality Forecasting
Moazzam Umer Gondal, Hamad ul Qudous, Asma Ahmad Farhan
TL;DR
This study asks whether lightweight additive models can rival deep learning for short-term urban air-quality forecasting. By benchmarking Facebook Prophet and NeuralProphet against LSTM, SARIMAX, and LightGBM on Beijing PM$_{2.5}$ and PM$_{10}$ data, it demonstrates that FB Prophet achieves high predictive accuracy (test $R^2 > 0.94$) with strong interpretability and deployment practicality. The approach hinges on systematic feature selection, leakage-safe scaling, and realistic last-week forecasting, highlighting the enduring value of simple, transparent time-series models in policy-relevant applications. The findings advocate for operational adoption of additive models and point to future work in multi-source data integration and privacy-preserving collaboration across monitoring networks.
Abstract
Accurate forecasting of urban air pollution is essential for protecting public health and guiding mitigation policies. While deep learning (DL) and hybrid pipelines dominate recent research, their complexity and limited interpretability hinder operational use. This study investigates whether lightweight additive models -- Facebook Prophet (FBP) and NeuralProphet (NP) -- can deliver competitive forecasts for particulate matter (PM$_{2.5}$, PM$_{10}$) in Beijing, China. Using multi-year pollutant and meteorological data, we applied systematic feature selection (correlation, mutual information, mRMR), leakage-safe scaling, and chronological data splits. Both models were trained with pollutant and precursor regressors, with NP additionally leveraging lagged dependencies. For context, two machine learning baselines (LSTM, LightGBM) and one traditional statistical model (SARIMAX) were also implemented. Performance was evaluated on a 7-day holdout using MAE, RMSE, and $R^2$. Results show that FBP consistently outperformed NP, SARIMAX, and the learning-based baselines, achieving test $R^2$ above 0.94 for both pollutants. These findings demonstrate that interpretable additive models remain competitive with both traditional and complex approaches, offering a practical balance of accuracy, transparency, and ease of deployment.
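
To make the evaluation protocol concrete, the following minimal Python sketch illustrates a chronological split with a final holdout window, scaling statistics fitted on the training portion only, and the three reported metrics. It is an illustration under stated assumptions, not the authors' code: the function names, the choice of z-score standardization (the abstract does not specify the scaler), the daily sampling of the toy series, and the persistence forecast used as a stand-in model are all hypothetical.

```python
import math
from statistics import mean, pstdev


def chronological_split(series, holdout_steps):
    """Keep time order: the last `holdout_steps` points become the test set."""
    if not 0 < holdout_steps < len(series):
        raise ValueError("holdout_steps must be between 1 and len(series) - 1")
    return series[:-holdout_steps], series[-holdout_steps:]


def fit_standardizer(train):
    """Estimate scaling statistics from the training split only (no leakage)."""
    mu = mean(train)
    sigma = pstdev(train) or 1.0  # fall back to 1.0 for a constant series
    return mu, sigma


def standardize(values, mu, sigma):
    """Apply statistics fitted on the training split to any split."""
    return [(v - mu) / sigma for v in values]


def mae(y_true, y_pred):
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when y_true is constant")
    return 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    # Synthetic daily series (assumption): weekly cycle plus a slow trend.
    series = [50.0 + 10.0 * math.sin(2 * math.pi * d / 7) + 0.2 * d for d in range(60)]

    # Hold out the final 7 steps, mirroring a last-week evaluation.
    train, test = chronological_split(series, holdout_steps=7)

    # Scaling parameters come from the training split and are reused on the test split.
    mu, sigma = fit_standardizer(train)
    train_scaled = standardize(train, mu, sigma)
    test_scaled = standardize(test, mu, sigma)

    # Stand-in forecast (assumption): repeat the last scaled training value.
    forecast = [train_scaled[-1]] * len(test_scaled)

    print("MAE :", mae(test_scaled, forecast))
    print("RMSE:", rmse(test_scaled, forecast))
    print("R^2 :", r2(test_scaled, forecast))
```

The key design point is that `fit_standardizer` never sees the holdout window, so the test metrics are not influenced by statistics drawn from the period being forecast.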
