Real-time Air Pollution prediction model based on Spatiotemporal Big data
Van-Duc Le, Tien-Cuong Bui, Sang Kyun Cha
TL;DR
The paper tackles real-time urban air pollution prediction using spatiotemporal big data collected from taxi-mounted sensors in Daegu. City-wide pollution is represented as 32×32 grayscale spatial images, which a CNN classifies in real time, and a hybrid temporal predictor (an LSTM for the time series plus a neural network for weather factors) forecasts future pollution levels. Key contributions include the 32×32 image representation, the CNN-based spatial predictor achieving ~74% accuracy, and the LSTM+NN hybrid, which improves temporal forecasts over baselines. The approach is demonstrated on a large real-time dataset preprocessed with Spark and implemented in TensorFlow, and it could inform urban air quality management in other cities.
Abstract
Air pollution is one of the major concerns for urban areas. Many countries have built monitoring stations that collect pollution values hourly. Recently, a research project in Daegu, Korea has been monitoring air quality in real time via sensors installed on taxis running across the whole city. The collected data is huge (sampled at 1-second intervals) and has both spatial and temporal dimensions. In this paper, based on this spatiotemporal big data, we propose a real-time air pollution prediction model that applies a Convolutional Neural Network (CNN) to the image-like spatial distribution of air pollution. For the temporal information in the data, we combine a Long Short-Term Memory (LSTM) unit for time series data with a neural network model for other factors that affect air pollution, such as weather conditions, to build a hybrid prediction model. This model is simple in architecture but still achieves good prediction performance.
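
To make the image-like spatial representation concrete, the following is a minimal sketch of how point readings from moving sensors might be rasterized into a 32×32 grayscale grid. It is an illustration under stated assumptions, not the authors' preprocessing pipeline: the function name `rasterize`, the `(lat, lon, value)` record layout, the bounding box, the clipping value `vmax`, per-cell averaging, and the choice to leave cells without readings at 0 are all hypothetical.

```python
from collections import defaultdict

GRID = 32  # image side length, matching the 32x32 representation


def rasterize(readings, bbox, vmax, grid=GRID):
    """Average point readings per cell and scale them to 0-255 grayscale.

    readings: iterable of (lat, lon, value) tuples (hypothetical layout)
    bbox: (lat_min, lat_max, lon_min, lon_max) of the covered area (assumed)
    vmax: pollution value mapped to 255; larger values are clipped (assumed)
    """
    lat_min, lat_max, lon_min, lon_max = bbox
    sums = defaultdict(float)
    counts = defaultdict(int)
    for lat, lon, value in readings:
        if not (lat_min <= lat <= lat_max and lon_min <= lon <= lon_max):
            continue  # drop readings outside the bounding box
        # Row 0 is the northern edge; column 0 is the western edge.
        row = min(int((lat_max - lat) / (lat_max - lat_min) * grid), grid - 1)
        col = min(int((lon - lon_min) / (lon_max - lon_min) * grid), grid - 1)
        sums[(row, col)] += value
        counts[(row, col)] += 1

    # Cells with no readings stay at 0 (an assumption of this sketch).
    image = [[0] * grid for _ in range(grid)]
    for (row, col), total in sums.items():
        mean = total / counts[(row, col)]
        image[row][col] = round(min(max(mean / vmax, 0.0), 1.0) * 255)
    return image
```

A grid built this way for each time window could serve as one input image for a CNN classifier, and a sequence of such windows as input to a temporal model. How sparse cells are filled and how pollution values are scaled are design choices that this sketch does not settle.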
