Real-time Air Pollution prediction model based on Spatiotemporal Big data

Van-Duc Le, Tien-Cuong Bui, Sang Kyun Cha

TL;DR

The paper tackles real-time urban air pollution prediction using spatiotemporal big data collected from taxi-mounted sensors in Daegu, Korea. City-wide pollution readings are converted into 32×32 grayscale images, which a CNN classifies in real time, and a hybrid temporal predictor (an LSTM for the time series plus a neural network for weather factors) forecasts future pollution levels. Key contributions include the 32×32 image representation, a CNN-based spatial predictor that reaches ~74% accuracy, and an LSTM+NN hybrid that improves temporal forecasts over baselines, demonstrated on a large real-time dataset preprocessed with Spark and implemented in TensorFlow. The approach supports scalable, real-time monitoring and could inform urban air quality management in other cities.
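
The 32×32 image representation mentioned above can be sketched roughly as follows. This is a minimal illustration in plain Python, not the paper's Spark pipeline: the bounding box `BBOX`, the scaling cap `PM_CAP`, the per-cell averaging, and the choice to leave empty cells at 0 are all assumptions made for the example.

```python
from collections import defaultdict

GRID = 32  # image side length stated in the summary

# Hypothetical bounding box (lat_min, lat_max, lon_min, lon_max); not from the paper.
BBOX = (35.75, 36.00, 128.45, 128.75)
# Hypothetical PM2.5 value (µg/m³) that maps to full white; not from the paper.
PM_CAP = 150.0


def to_grid_image(readings, bbox=BBOX, grid=GRID, pm_cap=PM_CAP):
    """Average (lat, lon, pm25) readings per cell and scale them to 0..255."""
    lat_min, lat_max, lon_min, lon_max = bbox
    sums = defaultdict(float)
    counts = defaultdict(int)
    for lat, lon, pm in readings:
        if not (lat_min <= lat <= lat_max and lon_min <= lon <= lon_max):
            continue  # drop readings outside the assumed city bounds
        # Row 0 is the northern edge, as in an image; clamp the far edge into the last cell.
        row = min(int((lat_max - lat) / (lat_max - lat_min) * grid), grid - 1)
        col = min(int((lon - lon_min) / (lon_max - lon_min) * grid), grid - 1)
        sums[(row, col)] += pm
        counts[(row, col)] += 1

    image = [[0] * grid for _ in range(grid)]  # cells with no reading stay 0 (assumption)
    for (row, col), total in sums.items():
        mean_pm = total / counts[(row, col)]
        image[row][col] = round(min(mean_pm, pm_cap) / pm_cap * 255)
    return image


readings = [
    (35.87, 128.61, 30.0),
    (35.87, 128.61, 60.0),   # same cell as above, so the two values are averaged
    (35.99, 128.46, 150.0),
    (37.00, 127.00, 80.0),   # outside the bounding box, ignored
]
image = to_grid_image(readings)
```

Each call produces one grayscale frame for one time window; a sequence of such frames, each paired with a class label, is the kind of input a CNN classifier would consume.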

Abstract

Air pollution is one of the greatest concerns for urban areas. Many countries have built monitoring stations that collect pollution values hourly. Recently, a research project in Daegu city, Korea has been monitoring air quality in real time via sensors installed on taxis running across the whole city. The collected data is huge (1-second interval) and has both spatial and temporal dimensions. In this paper, based on this spatiotemporal big data, we propose a real-time air pollution prediction model based on a Convolutional Neural Network (CNN) for the image-like spatial distribution of air pollution. For the temporal information in the data, we combine a Long Short-Term Memory (LSTM) unit for time series data with a neural network model for other air pollution impact factors, such as weather conditions, to build a hybrid prediction model. This model is simple in architecture but still achieves good prediction ability.
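
To make the data flow of the hybrid model concrete, here is a minimal forward-pass sketch in plain Python. It is not the authors' TensorFlow implementation: the layer sizes, the ReLU on the weather branch, and the choice to concatenate the final LSTM state with the weather features before a softmax output are assumptions made for illustration. The all-zero weights in the example are placeholders, not trained values.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(matrix, vector):
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


def lstm_step(x, h, c, params):
    """One standard LSTM step; params maps a gate name to (W, U, b)."""
    def gate(name, activation):
        W, U, b = params[name]
        pre = [wx + uh + bias for wx, uh, bias in zip(matvec(W, x), matvec(U, h), b)]
        return [activation(p) for p in pre]

    i = gate("input", sigmoid)
    f = gate("forget", sigmoid)
    o = gate("output", sigmoid)
    g = gate("candidate", math.tanh)
    c_next = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c, i, g)]
    h_next = [ok * math.tanh(ck) for ok, ck in zip(o, c_next)]
    return h_next, c_next


def dense(x, W, b, activation):
    return [activation(z + bias) for z, bias in zip(matvec(W, x), b)]


def softmax(logits):
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def hybrid_forward(series, weather, lstm_params, weather_layer, output_layer, hidden):
    """Run the LSTM over the series, then merge with the weather branch (assumed design)."""
    h, c = [0.0] * hidden, [0.0] * hidden
    for x in series:  # each x is the feature vector of one time step
        h, c = lstm_step(x, h, c, lstm_params)
    w_feat = dense(weather, *weather_layer, activation=lambda z: max(0.0, z))
    logits = dense(h + w_feat, *output_layer, activation=lambda z: z)
    return softmax(logits)


def zeros(rows, cols):
    return [[0.0] * cols for _ in range(rows)]


HIDDEN, CLASSES = 2, 3  # hypothetical sizes
lstm_params = {
    name: (zeros(HIDDEN, 1), zeros(HIDDEN, HIDDEN), [0.0] * HIDDEN)
    for name in ("input", "forget", "output", "candidate")
}
weather_layer = (zeros(2, 2), [0.0, 0.0])
output_layer = (zeros(CLASSES, HIDDEN + 2), [0.0] * CLASSES)

# Three past pollution values and two weather features (made-up numbers).
probs = hybrid_forward([[12.0], [18.0], [25.0]], [21.5, 0.6],
                       lstm_params, weather_layer, output_layer, HIDDEN)
```

With the placeholder zero weights every class receives the same probability; the point of the sketch is only the wiring, in which the time series passes through the recurrent branch while the weather factors pass through a separate feed-forward branch.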

Paper Structure

This paper contains 9 sections, 2 equations, 5 figures, 1 table.

Figures (5)

  • Figure 1: "Image-like" spatial distribution of PM2.5 air pollution values in Daegu city, Korea. $y = 2$ means the air pollution is Unhealthy, $y = 1$ means the pollution is Moderate, $y = 0$ means outdoor air is Good for health. The index $i$ indicates different time stamps.
  • Figure 2: Monitoring device information and Web UI of air quality data collection in Daegu city. Pictures are taken from [8].
  • Figure 3: CNN architecture for real-time air pollution classification in Daegu.
  • Figure 4: The hybrid model of LSTM units and a neural network model.
  • Figure 5: Prediction values with the testing set (01/2018). Prediction = 0 means outdoor air is Good (for health).
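
The class encoding used in the Figure 1 and Figure 5 captions can be written down as a small helper. The class indices and names come from the captions; the PM2.5 cut-offs `MODERATE_FROM` and `UNHEALTHY_FROM` are hypothetical placeholders, since this summary does not state the thresholds the paper used, and the `accuracy` helper is only a generic way to score such labels.

```python
# Class indices and names as given in the Figure 1 caption.
LABELS = {0: "Good", 1: "Moderate", 2: "Unhealthy"}

# Hypothetical PM2.5 cut-offs in µg/m³; the summary does not state the real ones.
MODERATE_FROM = 16.0
UNHEALTHY_FROM = 36.0


def pm25_to_class(pm25, moderate_from=MODERATE_FROM, unhealthy_from=UNHEALTHY_FROM):
    """Map a PM2.5 value to the class index y used in the captions."""
    if pm25 >= unhealthy_from:
        return 2
    if pm25 >= moderate_from:
        return 1
    return 0


def accuracy(y_true, y_pred):
    """Fraction of positions where the predicted class equals the true class."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


y_true = [pm25_to_class(v) for v in (10.0, 20.0, 50.0, 12.0)]
y_pred = [0, 1, 1, 0]  # made-up predictions for the example
score = accuracy(y_true, y_pred)
names = [LABELS[y] for y in y_true]
```

Keeping the thresholds as parameters makes it easy to swap in whichever air quality standard applies to the data being labelled.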