Cross-Resolution Attention Network for High-Resolution PM2.5 Prediction

Ammar Kheder, Helmi Toropainen, Wenqing Peng, Samuel Antão, Zhi-Song Liu, Michael Boy

Abstract

Vision Transformers have achieved remarkable success in spatio-temporal prediction, but their scalability remains limited for the ultra-high-resolution, continent-scale domains required in real-world environmental monitoring. A single European air-quality map at 1 km resolution comprises 29 million pixels, far beyond the limits of naive self-attention. We introduce CRAN-PM, a dual-branch Vision Transformer that uses cross-resolution attention to efficiently fuse global meteorological data (25 km) with local high-resolution PM2.5 at the current time (1 km). Rather than supplying physical drivers such as temperature and topography as additional inputs, we introduce elevation-aware self-attention and wind-guided cross-attention to force the network to learn physically consistent feature representations for PM2.5 forecasting. CRAN-PM is fully trainable and memory-efficient, generating the complete 29-million-pixel European map in 1.8 seconds on a single GPU. Evaluated on daily PM2.5 forecasting throughout Europe in 2022 (362 days, 2,971 European Environment Agency (EEA) stations), it reduces RMSE by 4.7% at T+1 and 10.7% at T+3 compared to the best single-scale baseline, while reducing bias in complex terrain by 36%.
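
To give a sense of why naive self-attention is out of reach at this scale, the following back-of-the-envelope sketch estimates the size of one dense attention matrix for a 29-million-pixel map. The 2-byte (half-precision) entries and the 16 x 16 patch size are illustrative assumptions, not values taken from the paper.

```python
def attention_matrix_bytes(num_tokens: int, bytes_per_entry: int = 2) -> int:
    """Memory for one dense (num_tokens x num_tokens) attention matrix."""
    return num_tokens * num_tokens * bytes_per_entry


NUM_PIXELS = 29_000_000  # map size quoted in the abstract
PATCH = 16               # assumed patch side length (hypothetical)

# One token per pixel: the attention matrix alone is on the order of petabytes.
pixel_level = attention_matrix_bytes(NUM_PIXELS)

# One token per 16 x 16 patch: still tens of gigabytes per head and per layer.
patch_tokens = NUM_PIXELS // (PATCH * PATCH)
patch_level = attention_matrix_bytes(patch_tokens)

print(f"pixel tokens: {pixel_level / 1e15:.2f} PB per attention matrix")
print(f"{patch_tokens} patch tokens: {patch_level / 1e9:.1f} GB per attention matrix")
```

Even under the patch-level assumption, a single attention matrix per head and layer would exceed the memory of most single GPUs, which is the motivation for processing local subimages and fusing them with a coarse global context.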

Paper Structure

This paper contains 26 sections, 6 equations, 10 figures, 7 tables, 1 algorithm.

Figures (10)

  • Figure 1: CRAN-PM predicts daily PM$_{2.5}$ at 1 km resolution across Europe. (a) Full European prediction for January 25, 2022 (T+1 horizon). Red rectangles indicate zoom regions. (b--e) Regional details for Po Valley, Paris, Malmö, and Silesia. Colored circles represent independent EEA ground station measurements. Annual T+1 performance: RMSE = 5.99 µg/m$^3$, $r$ = 0.74.
  • Figure 2: Architecture of CRAN-PM. A global branch (top) encodes coarse meteorological fields with wind-guided token reordering and elevation-aware attention; a local branch (bottom) encodes high-resolution PM$_{2.5}$ subimages. Two wind-biased cross-attention layers fuse the branches (fine queries coarse). A PixelShuffle-based Upblock (red inset) reconstructs the residual. The yellow inset details elevation-aware attention.
  • Figure 3: Europe-wide PM$_{2.5}$ evaluation (2022). (a) GT (GHAP, 1 km), Jan. 25, 2022. (b) CRAN-PM T+1; inset: Po Valley. (c,d) RMSE across T+1--T+3 at 1 km and 25 km. CRAN-PM (red stars) consistently outperforms all baselines.
  • Figure 4: Temporal PM$_{2.5}$ evolution (2022, T+1). Blue: GHAP; red: CRAN-PM. Six regions. CRAN-PM achieves lowest RMSE in all regions.
  • Figure 5: Regional PM$_{2.5}$ comparison at 1 km (T+1). Rows: Po Valley, Paris, Silesia, Rhine-Ruhr, London. (a) GT, (b) CAMS, (c) ClimaX, (d) CRAN-PM, (e) error (SSIM $\geq$ 0.63).
  • ...and 5 more figures
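
The fusion step named in the Figure 2 caption (fine tokens query coarse tokens, with a wind-derived bias) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the single-head formulation, and in particular the cosine-based `upwind_bias` are hypothetical choices made only to show how an additive bias can steer cross-resolution attention toward upwind coarse cells.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def cross_attention(fine, coarse, w_q, w_k, w_v, bias=None):
    """Single-head cross-attention: fine tokens are queries, coarse tokens
    supply keys and values. `bias` is an optional additive term of shape
    (n_fine, n_coarse) applied to the attention logits."""
    q = fine @ w_q                           # (n_fine, d)
    k = coarse @ w_k                         # (n_coarse, d)
    v = coarse @ w_v                         # (n_coarse, d)
    scores = q @ k.T / np.sqrt(q.shape[-1])  # (n_fine, n_coarse)
    if bias is not None:
        scores = scores + bias
    weights = softmax(scores, axis=-1)
    return weights @ v, weights


def upwind_bias(fine_xy, coarse_xy, wind_uv, strength=1.0):
    """Hypothetical wind bias: cosine between the wind vector at each coarse
    cell and the direction from that cell to the fine token. It is positive
    when the wind blows from the coarse cell toward the fine token."""
    to_fine = fine_xy[:, None, :] - coarse_xy[None, :, :]  # (n_fine, n_coarse, 2)
    dist = np.linalg.norm(to_fine, axis=-1) + 1e-8
    speed = np.linalg.norm(wind_uv, axis=-1) + 1e-8        # (n_coarse,)
    cos = (to_fine * wind_uv[None, :, :]).sum(axis=-1) / (dist * speed[None, :])
    return strength * cos


# Toy usage: 5 fine tokens attend over 3 coarse tokens.
rng = np.random.default_rng(0)
fine_tokens = rng.normal(size=(5, 8))
coarse_tokens = rng.normal(size=(3, 8))
w_q, w_k, w_v = (rng.normal(size=(8, 4)) for _ in range(3))
bias = upwind_bias(
    fine_xy=rng.uniform(size=(5, 2)),
    coarse_xy=rng.uniform(size=(3, 2)),
    wind_uv=rng.normal(size=(3, 2)),
)
fused, attn = cross_attention(fine_tokens, coarse_tokens, w_q, w_k, w_v, bias)
print(fused.shape, attn.shape)
```

Because the number of coarse tokens is small relative to the number of fine tokens, the logit matrix here is (n_fine, n_coarse) rather than quadratic in the fine-token count, which is the general reason cross-resolution attention scales better than full self-attention over the fine grid.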