State estimation of urban air pollution with statistical, physical, and super-learning graph models
Matthieu Dolbeault, Olga Mula, Agustín Somacal
TL;DR
This work tackles real-time state estimation of urban NO$_2$ concentrations by modeling the city as a metric/quantum graph and fusing heterogeneous data sources: sensor measurements, meteorology, and traffic-derived emissions. It develops and compares a range of reconstruction methods (spatial average, BLUE, kriging, source-emission models, and physics-based elliptic diffusion on graphs), then combines them in an ensemble super-learning framework to improve accuracy. Reduced-order and physics-informed approaches keep the computational cost manageable while capturing the key spatial dynamics. The methods are validated on Paris data with a leave-one-out cross-validation strategy that mitigates the limited sensor coverage. The ensemble method performs robustly across stations, which highlights the value of integrating data-driven and physics-driven models for real-time urban pollution mapping and points to data quality and topography as avenues for future gains.
Abstract
We consider the problem of real-time reconstruction of urban air pollution maps. The task is challenging because the available data come from heterogeneous sources, direct measurements are scarce and noisy, and the areas to be covered are large. In this work, we introduce different reconstruction methods based on posing the problem on city graphs. Our strategies can be classified as fully data-driven, physics-driven, or hybrid, and we combine them with super-learning models. The performance of the methods is tested on the inner city of Paris, France.
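
To make the idea of combining reconstruction methods with super-learning more concrete, the following is a minimal sketch and not the authors' implementation. It rests on several assumptions that are not in the text: the station data are synthetic, distances are Euclidean rather than measured along a city graph, inverse-distance weighting stands in for the more elaborate estimators (BLUE, kriging, physics-based models), and the ensemble weights are fitted by ordinary least squares on leave-one-out predictions. All function and variable names are hypothetical.

```python
import numpy as np


def spatial_average(xy_train, y_train, xy_query):
    """Baseline estimator: mean of all training stations (ignores location)."""
    return float(np.mean(y_train))


def inverse_distance(xy_train, y_train, xy_query, power=2.0):
    """Stand-in spatial estimator (assumption): inverse-distance weighting."""
    d = np.linalg.norm(xy_train - xy_query, axis=1)
    if np.any(d == 0.0):
        # Query coincides with a training station: return its value.
        return float(y_train[np.argmin(d)])
    w = 1.0 / d**power
    return float(np.sum(w * y_train) / np.sum(w))


def leave_one_out_predictions(xy, y, models):
    """For each station, predict its value from all the other stations."""
    n = len(y)
    preds = np.empty((n, len(models)))
    for i in range(n):
        keep = np.arange(n) != i  # hide station i from the models
        for j, model in enumerate(models):
            preds[i, j] = model(xy[keep], y[keep], xy[i])
    return preds


def fit_super_learner(preds, y):
    """Least-squares weights combining the base-model predictions."""
    weights, *_ = np.linalg.lstsq(preds, y, rcond=None)
    return weights


def rmse(pred, truth):
    return float(np.sqrt(np.mean((pred - truth) ** 2)))


# Synthetic stations (assumption): 12 sensors in a 10 x 10 domain with a
# west-to-east concentration gradient plus measurement noise.
rng = np.random.default_rng(0)
xy = rng.uniform(0.0, 10.0, size=(12, 2))
y = 40.0 + 3.0 * xy[:, 0] + rng.normal(0.0, 2.0, size=12)

models = [spatial_average, inverse_distance]
preds = leave_one_out_predictions(xy, y, models)
weights = fit_super_learner(preds, y)
ensemble = preds @ weights

print("RMSE spatial average :", rmse(preds[:, 0], y))
print("RMSE inverse distance:", rmse(preds[:, 1], y))
print("RMSE ensemble        :", rmse(ensemble, y))
```

Because the weights are fitted on the same leave-one-out predictions they are scored on, the ensemble error reported here can never exceed that of either base model; an honest assessment of generalization would need an additional, nested hold-out step.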
