Applications of machine learning and IoT for Outdoor Air Pollution Monitoring and Prediction: A Systematic Literature Review
Ihsane Gryech, Chaimae Assad, Mounir Ghogho, Abdellatif Kobbane
TL;DR
This systematic review consolidates evidence on how machine learning and IoT technologies are applied to outdoor air pollution monitoring and forecasting, categorizing studies by monitoring cost (low-cost, high-cost, hybrid) and by prediction approach (time series, feature-based, spatio-temporal). It finds that hybrid setups often deliver the best trade-off between coverage and accuracy, drawing on rich input features such as meteorology, traffic, land use, and context-specific factors. Gaps remain, however, in geographic diversity and feature breadth. The review highlights representative models (e.g., LSTM, XGBoost, RF, GC-DCRNN) and shows how mobile and fixed sensors can be fused to produce accurate, high-resolution air quality maps and forecasts. Practical implications span health, urban planning, and smart city initiatives, while future work should emphasize chemical modeling integration, global data diversity, and spatio-temporal forecasting at scale.
Abstract
According to the World Health Organization (WHO), air pollution kills seven million people every year. Outdoor air pollution is a major environmental health problem affecting low-, middle-, and high-income countries. In the past few years, the research community has explored IoT-enabled machine learning applications for outdoor air pollution prediction. The general objective of this paper is to systematically review applications of machine learning and the Internet of Things (IoT) for outdoor air pollution prediction, together with the combinations of monitoring sensors and input features used. Two research questions were formulated for this review. A total of 1086 publications were collected in the initial PRISMA stage. After the screening and eligibility phases, 37 papers were selected for inclusion. A cost-based analysis was conducted on the findings to distinguish prediction enabled by high-cost monitoring, low-cost IoT, and hybrid setups. Three methods of prediction were identified: time series, feature-based, and spatio-temporal. This review's findings identify major limitations in applications found in the literature, namely lack of coverage, lack of diversity of data, and lack of inclusion of context-specific features. This review proposes directions for future research and underlines practical implications in healthcare, urban planning, global synergy, and smart cities.
