PCDCNet: A Surrogate Model for Air Quality Forecasting with Physical-Chemical Dynamics and Constraints

Shuo Wang, Yun Cheng, Qingye Meng, Olga Saukh, Jiang Zhang, Jingfang Fan, Yuanting Zhang, Xingyuan Yuan, Lothar Thiele

TL;DR

PCDCNet addresses the need for accurate, real-time air quality forecasting by integrating emissions data, meteorology, and physical-chemical constraints into a surrogate model that mimics CMAQ-like dynamics. The architecture combines Local Interaction Dynamics, Spatial Transport Dynamics, and Temporal Accumulation Dynamics with a Domain-Informed Constraint loss to enforce mass conservation and physical plausibility. It achieves state-of-the-art 72-hour station-level predictions for PM$_{2.5}$ and O$_3$, while significantly reducing computational cost and enabling online deployment. The framework demonstrates strong generalization and robustness across two major Chinese regions and diverse pollution events, offering a practical tool for personal protection, travel planning, and regulatory decision-making.

Abstract

Air quality forecasting (AQF) is critical for public health and environmental management, yet remains challenging due to the complex interplay of emissions, meteorology, and chemical transformations. Traditional numerical models, such as CMAQ and WRF-Chem, provide physically grounded simulations but are computationally expensive and rely on uncertain emission inventories. Deep learning models, while computationally efficient, often generalize poorly because they lack physical constraints. To bridge this gap, we propose PCDCNet, a surrogate model that integrates numerical modeling principles with deep learning. PCDCNet explicitly incorporates emissions, meteorological influences, and domain-informed constraints to model pollutant formation, transport, and dissipation. By combining graph-based spatial transport modeling, recurrent structures for temporal accumulation, and representation enhancement for local interactions, PCDCNet achieves state-of-the-art (SOTA) performance in 72-hour station-level PM$_{2.5}$ and O$_3$ forecasting while significantly reducing computational costs. Furthermore, our model is deployed on an online platform that provides free, real-time air quality forecasts, demonstrating its scalability and societal impact. By aligning deep learning with physical consistency, PCDCNet offers a practical and interpretable solution for AQF, enabling informed decision-making for both personal and regulatory applications.
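The idea of pairing a data-fitting objective with domain-informed constraints can be illustrated with a minimal sketch. This is not the paper's formulation: the function names (`mae`, `mass_balance_residual`, `constrained_loss`), the toy mass-balance residual, and the squared-penalty form are all assumptions made for illustration. The default weight `lam=10.0` mirrors the $\lambda = 10$ setting mentioned in the Figure 4 caption.

```python
# Hypothetical sketch of a soft-constrained loss: data error + lam * physics penalty.
# None of these names or formulas are taken from the PCDCNet paper.

def mae(pred, target):
    """Mean absolute error between two equal-length sequences."""
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(pred)

def mass_balance_residual(c_now, c_next, net_source):
    """Toy residual per station: the concentration change that is NOT
    explained by an assumed net source term (emission + inflow - removal)."""
    return [(n - c) - s for c, n, s in zip(c_now, c_next, net_source)]

def constrained_loss(pred, target, residuals, lam=10.0):
    """Data term plus a weighted mean-squared constraint violation."""
    penalty = sum(r * r for r in residuals) / len(residuals)
    return mae(pred, target) + lam * penalty

# Two stations, one forecast step (made-up numbers).
c_now = [8.0, 18.0]        # current concentrations
pred = [10.0, 20.0]        # predicted next-step concentrations
target = [12.0, 19.0]      # observed next-step concentrations
net_source = [2.0, 1.0]    # assumed net source per station

residuals = mass_balance_residual(c_now, pred, net_source)
loss = constrained_loss(pred, target, residuals)
```

With a soft penalty like this, a larger `lam` pushes predictions toward the assumed balance at the expense of the pure data fit, which is the trade-off a constraint weight controls.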

Paper Structure

This paper contains 69 sections, 15 equations, 16 figures, 3 tables, 1 algorithm.

Figures (16)

  • Figure 1: Air quality ($\mathrm{PM}_{2.5}$ and $\mathrm{O}_{3}$) is shaped by complex interactions between meteorology (e.g., UV radiation, wind) and emissions (e.g., $\mathrm{NO}_x$, $\mathrm{VOC}$). Capturing these spatiotemporal dynamics, including pollutant transport and secondary formation, requires integrating emissions data with meteorology, which poses significant modeling challenges.
  • Figure 2: The framework of PCDCNet for air quality forecasting (AQF), comprising three stages: Data Pre-processing (graph construction and integration of $\mathbf{X}$, $\mathbf{P}$, $\mathbf{Q}$), Model Development (modules for temporal, spatial, and local dynamics with domain-informed constraints), and Model Deployment (real-time predictions via cloud-based Dockerized APIs).
  • Figure 3: MAE trends for $\mathrm{PM}_{2.5}$ in BTHSA and $\mathrm{O}_3$ in YRD over a 72-hour prediction horizon.
  • Figure 4: Analysis of Domain-Informed Constraints (DIC). Temporal and spatial DIC loss components confirm the necessity of separately modeling pollutant conservation across time and space. Stronger DIC constraints ($\lambda = 10$) improve test set performance while maintaining physical consistency.
  • Figure 5: Performance evaluation in BTHSA. (Left) Sensitivity analysis of hidden size (16, 32, 64) shows that 32 yields the lowest MAE, balancing complexity and generalization. (Right) Ablation study confirms performance drops when removing components, validating the model design.
  • ...and 11 more figures
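
An MAE-versus-lead-time curve of the kind described in the Figure 3 caption can be computed by averaging absolute errors separately for each forecast hour. The sketch below assumes forecasts are stored as one list per sample with one value per lead hour; the function name `mae_by_horizon` and the data layout are illustrative assumptions, not the paper's evaluation code.

```python
# Hypothetical sketch: MAE per forecast lead hour.
# preds[i][h] is the forecast for sample i at lead hour h (assumed layout).

def mae_by_horizon(preds, targets):
    """Return one MAE value per lead hour, averaged over all samples."""
    n_samples = len(preds)
    n_hours = len(preds[0])
    return [
        sum(abs(preds[i][h] - targets[i][h]) for i in range(n_samples)) / n_samples
        for h in range(n_hours)
    ]

# Two samples, three lead hours (made-up numbers; a real run would use 72).
preds = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0]]
targets = [[1.0, 1.0, 1.0], [1.0, 2.0, 3.0]]
curve = mae_by_horizon(preds, targets)
```

Plotting `curve` against the lead hour gives the error trend over the prediction horizon.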