Discovering Spatial Correlations of Earth Observations for weather forecasting by using Graph Structure Learning

Hyeon-Ju Jeon, Jeon-Ho Kang, In-Hyuk Kwon, O-Joun Lee

TL;DR

This work addresses the limitations of fixed graph structures in numerical weather prediction (NWP) by introducing CloudNine-v2, a spatiotemporal graph neural network (STGNN) with adaptive, distance-aware structure learning that discovers dynamic spatial correlations between NWP grid points and heterogeneous Earth observations. It constructs region-specific adjacencies via a differentiable Gumbel-Softmax over edge scores that combine feature similarity and spatial distance, with a continuous neighborhood size $k_{i,t}$ that adapts per node. A subgraph sampling strategy ($k$-hop, $k=3$) enables scalable integration of multi-source data into an STGNN with a GNN encoder and GRU decoder, while pretraining on node-feature reconstruction facilitates learning across heterogeneous inputs. Empirical results on Korean Peninsula data show up to a 15% RMSE reduction and greater robustness in highly variable regions. These results support the approach's ability to capture dynamic spatial relationships and localized atmospheric variability, with implications for multi-source data assimilation in weather forecasting.
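To make the Gumbel-Softmax step concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the scoring rule (cosine similarity minus a weighted distance), the weight `alpha`, and the function names are assumptions chosen only to illustrate how feature similarity and spatial distance could be combined into edge scores and then relaxed into differentiable edge weights.

```python
import math
import random


def edge_scores(feat_i, feats_j, dist_ij, alpha=1.0):
    """Score candidate edges from one node to each candidate neighbor.

    Assumed rule: cosine similarity minus alpha * distance.
    `alpha` is a hypothetical weight on the distance penalty.
    """
    def cosine(u, v):
        dot = sum(a * b for a, b in zip(u, v))
        nu = math.sqrt(sum(a * a for a in u))
        nv = math.sqrt(sum(b * b for b in v))
        return dot / (nu * nv + 1e-12)

    return [cosine(feat_i, f) - alpha * d for f, d in zip(feats_j, dist_ij)]


def gumbel_softmax(scores, tau=0.5, rng=random):
    """Relaxed sample over candidates: softmax((score + Gumbel noise) / tau)."""
    noisy = []
    for s in scores:
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)  # keep both logs finite
        g = -math.log(-math.log(u))                      # Gumbel(0, 1) noise
        noisy.append((s + g) / tau)
    m = max(noisy)                                       # subtract max for stability
    exps = [math.exp(x - m) for x in noisy]
    total = sum(exps)
    return [e / total for e in exps]


# One grid point and three candidate observations (toy values).
rng = random.Random(0)
grid_feat = [1.0, 0.0]
obs_feats = [[0.9, 0.1], [0.0, 1.0], [1.0, 0.0]]
obs_dist = [0.1, 0.1, 2.0]

scores = edge_scores(grid_feat, obs_feats, obs_dist)
weights = gumbel_softmax(scores, tau=0.5, rng=rng)
```

In this toy setup the first observation is both similar and nearby, so it receives the highest score; the third is identical in features but far away, so the distance penalty pushes it to the bottom. Lower values of `tau` push the sampled weights closer to a one-hot choice, while higher values spread them more evenly.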

Abstract

This study aims to improve the accuracy of weather predictions by discovering spatial correlations between Earth observations and atmospheric states. Existing numerical weather prediction (NWP) systems predict future atmospheric states at fixed locations, called NWP grid points, by analyzing previous atmospheric states and newly acquired Earth observations. However, the shifting locations of observations and the surrounding meteorological context induce complex, dynamic spatial correlations that are difficult for traditional NWP systems to capture, since they rely on strict statistical and physical formulations. To handle these complicated, dynamically changing spatial correlations, we employ spatiotemporal graph neural networks (STGNNs) with structure learning. However, structure learning has an inherent limitation: by generating excessive edges, it can cause structural information loss and over-smoothing. To solve this problem, we regulate edge sampling by adaptively determining node degrees and considering the spatial distances between NWP grid points and observations. We validated the effectiveness of the proposed method (CloudNine-v2) using real-world atmospheric state and observation data from East Asia, achieving up to 15% reductions in RMSE over existing STGNN models. Even in areas with high atmospheric variability, CloudNine-v2 consistently outperformed baselines with and without structure learning.
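The idea of regulating edge sampling through per-node degrees and spatial distance can be illustrated with a simplified, non-differentiable sketch. It assumes an integer neighbor budget per node and a hard distance cutoff, whereas the paper describes a continuous, learned neighborhood size; `select_edges`, `k`, and `max_dist` are hypothetical names introduced here for illustration only.

```python
def select_edges(scores, dists, k, max_dist):
    """Keep, for each node i, its top-k[i] candidates that lie within max_dist.

    scores[i][j], dists[i][j]: edge score and spatial distance from node i
    to candidate j. k[i] is a per-node neighbor budget (a hypothetical
    stand-in for a learned, adaptive degree).
    """
    edges = []
    for i, (row_s, row_d) in enumerate(zip(scores, dists)):
        # Discard candidates that are too far away, regardless of score.
        allowed = [j for j, d in enumerate(row_d) if d <= max_dist]
        # Rank the remaining candidates by score, best first.
        allowed.sort(key=lambda j: row_s[j], reverse=True)
        # Cap the node's degree at its own budget.
        edges.extend((i, j) for j in allowed[: k[i]])
    return edges


scores = [[0.9, 0.8, 0.1], [0.2, 0.7, 0.6]]
dists = [[1.0, 5.0, 1.5], [0.5, 0.5, 0.5]]
edges = select_edges(scores, dists, k=[2, 1], max_dist=2.0)
# edges == [(0, 0), (0, 2), (1, 1)]
```

Node 0 has a budget of two but its second-best candidate is beyond the cutoff, so it falls back to a lower-scoring nearby one; node 1 keeps only its single best candidate. Capping degrees this way limits how many edges each node contributes, which is the lever the abstract points to for avoiding excessive edges.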

Paper Structure

This paper contains 16 sections, 11 equations, 4 figures, 3 tables.

Figures (4)

  • Figure 1: Regional variability and its impact on atmospheric state estimation errors. Areas with higher variability (black ellipses) tend to show larger RMSE values.
  • Figure 2: An illustration of the overall process of CloudNine-v2. The model consists of three stages: (1) extracting features in both NWP grid points and multi-source observations, (2) adaptive graph structure learning, and (3) spatial and temporal feature extraction for weather forecasting.
  • Figure 3: Node-level prediction accuracy for four meteorological variables, classified by low and high temporal variability groups.
  • Figure 4: Sensitivity analysis of (left) the Gumbel-softmax temperature ($\tau$), and (right) the hidden dimension size, evaluated in terms of mean $R^2$ (solid line) and standard deviation (shaded area).