Lightweight ML-Based Air Quality Prediction for IoT and Embedded Applications
Md. Sad Abdullah Sami, Mushfiquzzaman Abid
TL;DR
The paper investigates lightweight versus full XGBoost configurations for predicting CO and NO2 using the AirQualityUCI dataset, balancing predictive accuracy with edge-deployment constraints. It introduces a resource-aware framework, evaluates standard regression metrics alongside model size, inference time, and RAM usage, and finds that the full model is more accurate while the tiny model offers substantial efficiency gains suitable for embedded IoT contexts. The key contribution is quantifying the trade-offs between accuracy and resource demands to guide TinyML-informed deployment in urban air quality monitoring. The results support deploying simplified models on constrained devices without severely sacrificing forecast quality, enabling real-time, on-device air quality sensing. The study also outlines limitations and directions for validating across diverse contexts and in real-time streaming settings.
Abstract
This study investigates the effectiveness and efficiency of two variants of the XGBoost regression model, a full-capacity version and a lightweight (tiny) version, for predicting the concentrations of carbon monoxide (CO) and nitrogen dioxide (NO2). Using the AirQualityUCI dataset, collected over one year in an urban environment, we evaluated both models with widely accepted regression metrics: Mean Absolute Error (MAE), Root Mean Square Error (RMSE), Mean Bias Error (MBE), and the coefficient of determination (R²). We also assessed resource-oriented metrics: inference time, model size, and peak RAM usage. The full XGBoost model achieved higher predictive accuracy for both pollutants, while the tiny model, though slightly less precise, required significantly less inference time and model storage. These results demonstrate the feasibility of deploying simplified models in resource-constrained environments with only a modest loss of predictive accuracy, which makes the tiny XGBoost model suitable for real-time air-quality monitoring in IoT and embedded applications.
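As an illustration of the evaluation described in the abstract, the following is a minimal sketch of how the four regression metrics and two of the resource metrics could be computed. It is not the authors' code. The function names `regression_metrics` and `profile_inference` are hypothetical, the sign convention for MBE (predicted minus observed) is an assumption, and `tracemalloc` only tracks memory allocated through Python, so it would likely understate the peak RAM of a native library such as XGBoost.

```python
import math
import time
import tracemalloc


def regression_metrics(y_true, y_pred):
    """Return MAE, RMSE, MBE and R^2 for two equal-length sequences."""
    n = len(y_true)
    if n == 0 or n != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")
    # Assumed sign convention: error = predicted - observed,
    # so a positive MBE means the model over-predicts on average.
    errors = [p - t for t, p in zip(y_true, y_pred)]
    ss_res = sum(e * e for e in errors)
    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(ss_res / n)
    mbe = sum(errors) / n
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    # R^2 is undefined when the observations have zero variance.
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan")
    return {"MAE": mae, "RMSE": rmse, "MBE": mbe, "R2": r2}


def profile_inference(predict_fn, inputs):
    """Run predict_fn once; return (outputs, seconds, peak traced bytes)."""
    tracemalloc.start()
    start = time.perf_counter()
    outputs = predict_fn(inputs)
    elapsed = time.perf_counter() - start
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return outputs, elapsed, peak


if __name__ == "__main__":
    # Toy values for illustration only, not data from the paper.
    observed = [1.0, 2.0, 3.0, 4.0]
    predicted = [1.5, 2.0, 2.5, 4.5]
    print(regression_metrics(observed, predicted))

    # Stand-in "model": any callable mapping inputs to predictions.
    _, seconds, peak_bytes = profile_inference(
        lambda xs: [2.0 * x for x in xs], observed
    )
    print(f"inference: {seconds:.6f} s, peak traced memory: {peak_bytes} B")
```

In a comparison like the one in the study, the same two helpers would be applied to the full and the tiny model in turn, with `predict_fn` wrapping each model's prediction call. Model size could be read from the serialized file on disk.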
