Spatial Transfer Learning for Estimating PM2.5 in Data-poor Regions

Shrey Gupta; Yongbee Park; Jianzhao Bi; Suyash Gupta; Andreas Züfle; Avani Wildani; Yang Liu

TL;DR

This work generates a Latent Dependency Factor (LDF) using a novel two-stage autoencoder model that learns from clusters of similar source and target domain data, and shows that transfer learning models using LDF improve on the baselines by 19.34%.

Abstract

Air pollution, especially particulate matter 2.5 (PM2.5), is a pressing public health concern and is difficult to estimate in developing countries (data-poor regions) because they lack ground sensors. Transfer learning models can address this problem by drawing knowledge from alternate data sources, i.e., data from data-rich regions. However, current transfer learning methodologies do not account for dependencies between the source and target domains. We recognize this transfer problem as spatial transfer learning and propose a new feature, the Latent Dependency Factor (LDF), which captures the spatial and semantic dependencies of both domains and is then added to the feature space of each domain. We generate LDF using a novel two-stage autoencoder model that learns from clusters of similar source and target domain data. Our experiments show that transfer learning models using LDF improve on the baselines by 19.34%. We additionally support our experiments with qualitative findings.
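The feature-augmentation idea in the abstract (compute one extra feature from nearby samples, then append it to a domain's feature space) can be illustrated with a minimal sketch. This is not the paper's two-stage autoencoder: the inverse-distance-weighted mean below is a deliberately simple stand-in for LDF, and the function names, the Euclidean distance on coordinates, and the default k are all assumptions made for illustration.

```python
import numpy as np


def knn_indices(query_xy, ref_xy, k):
    """Return indices and distances of the k nearest reference points.

    Assumption: plain Euclidean distance on 2-D coordinates.
    """
    # Pairwise distances, shape (n_query, n_ref).
    dist = np.linalg.norm(query_xy[:, None, :] - ref_xy[None, :, :], axis=2)
    idx = np.argsort(dist, axis=1)[:, :k]
    return idx, np.take_along_axis(dist, idx, axis=1)


def placeholder_dependency_feature(query_xy, ref_xy, ref_pm25, k=4, eps=1e-9):
    """Hypothetical stand-in for LDF: inverse-distance-weighted mean of the
    PM2.5 readings of the k nearest reference sensors.

    The paper learns LDF with an autoencoder; this only mimics the idea of
    summarising a sample's neighborhood as one extra feature.
    """
    idx, dist = knn_indices(query_xy, ref_xy, k)
    weights = 1.0 / (dist + eps)                   # closer sensors weigh more
    weights /= weights.sum(axis=1, keepdims=True)  # each row sums to 1
    return (weights * ref_pm25[idx]).sum(axis=1)


def augment(features, extra_feature):
    """Append the extra feature as a new last column of the feature matrix."""
    return np.column_stack([features, extra_feature])


if __name__ == "__main__":
    # Toy data: four reference sensors and one query location.
    ref_xy = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [5.0, 5.0]])
    ref_pm25 = np.array([10.0, 20.0, 30.0, 100.0])
    query_xy = np.array([[0.5, 0.0]])
    query_features = np.array([[0.1, 0.2, 0.3]])

    ldf_like = placeholder_dependency_feature(query_xy, ref_xy, ref_pm25, k=2)
    print(augment(query_features, ldf_like).shape)
```

The same `augment` step would apply to both the source and the target feature matrices. Any downstream regressor then trains on the widened matrices, which is where a learned LDF would replace the placeholder column.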

Paper Structure (32 sections, 2 equations, 5 figures, 3 tables)

Introduction
Related Work
Estimating PM$_{2.5}$ via Transfer Learning
Transfer Learning via Feature Augmentation
Problem Formulation
Methodology
Neighborhood Cloud Generation
Generating Latent Dependency Factor (LDF)
Encoder-decoder Stage
Encoder-estimator Stage
Transfer Learning and Multivariate Regression
Evaluation
Datasets
United States dataset
Lima dataset
...and 17 more sections

Figures (5)

Figure 1: Framework for spatial transfer learning via Latent Dependency Factor
Figure 2: Two-stage autoencoder model for generating LDF.
Figure 3: (a) US PM$_{2.5}$ ground sensors. The points in the pink target region represent sample training (green) and testing (red) sensors. The green and yellow regions represent the eastern and north-eastern source regions, respectively. (b) PM$_{2.5}$ sensors in Lima, Peru. Red points represent sensors used for training, and the grey area represents satellite data for testing.
Figure 4: (a) Annual mean PM$_{2.5}$ prediction for California-Nevada, trained using GBR and NNW with and without LDF features (9 sensors). (b) Annual mean PM$_{2.5}$ prediction for Lima region trained using NNW models.
Figure 5: (a) Comparing performance of NNW [LDF] model when neighborhood cloud uses k = {4, 8, 12, 16} neighbors. (b) Ablation study comparing the performance of GBR, GBR [LDF], GBR [LDF-A], NNW, and NNW [LDF] models.
