Advancing Heatwave Forecasting via Distribution Informed-Graph Neural Networks (DI-GNNs): Integrating Extreme Value Theory with GNNs

Farrukh A. Chishtie, Dominique Brunet, Rachel H. White, Daniel Michelson, Jing Jiang, Vicky Lucas, Emily Ruboonga, Sayana Imaash, Melissa Westland, Timothy Chui, Rana Usman Ali, Mujtaba Hassan, Roland Stull, David Hudak

TL;DR

This paper introduces the Distribution-Informed Graph Neural Network (DI-GNN), a framework that integrates principles from Extreme Value Theory (EVT) into the graph neural network architecture and reports significant improvements in balanced accuracy, recall, and precision over baseline models.

Abstract

Heatwaves, prolonged periods of extreme heat, have intensified in frequency and severity due to climate change, posing substantial risks to public health, ecosystems, and infrastructure. Despite advances in Machine Learning (ML) modeling, accurate heatwave forecasting at weather scales (1–15 days) remains challenging because of the non-linear interactions between atmospheric drivers and the rarity of these extreme events. Traditional models that rely on heuristic feature engineering often fail to generalize across diverse climates and to capture the complexities of heatwave dynamics. This study introduces the Distribution-Informed Graph Neural Network (DI-GNN), a novel framework that integrates principles from Extreme Value Theory (EVT) into the graph neural network architecture. DI-GNN incorporates Generalized Pareto Distribution (GPD)-derived descriptors into the feature space, adjacency matrix, and loss function to enhance its sensitivity to rare heatwave occurrences. By prioritizing the tails of climatic distributions, DI-GNN addresses the limitations of existing methods, particularly on imbalanced datasets where traditional metrics such as accuracy are misleading. Empirical evaluations using weather station data from British Columbia, Canada, demonstrate the superior performance of DI-GNN compared to baseline models. DI-GNN achieved significant improvements in balanced accuracy, recall, and precision, with high AUC and average precision scores, reflecting its robustness in distinguishing heatwave events.
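
To make the idea of GPD-derived descriptors concrete, the following is a minimal sketch of a peaks-over-threshold fit for a single station's temperature series. It is not the paper's implementation: the function name `gpd_tail_descriptors`, the 95th-percentile threshold, and the method-of-moments estimator are all assumptions chosen for illustration, and the abstract does not say which estimator or threshold the authors use.

```python
import random
import statistics


def gpd_tail_descriptors(values, quantile=0.95):
    """Illustrative GPD fit to exceedances over an empirical quantile.

    Hypothetical helper: uses the method of moments, which is only
    valid when the true shape parameter is below 0.5.
    """
    ordered = sorted(values)
    # Threshold u: empirical quantile of the series (assumed choice).
    u = ordered[int(quantile * (len(ordered) - 1))]
    excesses = [v - u for v in values if v > u]
    if len(excesses) < 2:
        raise ValueError("too few exceedances to fit a tail")
    mean = statistics.fmean(excesses)
    var = statistics.variance(excesses)
    if var == 0.0:
        raise ValueError("exceedances have zero variance")
    # For a GPD: mean = scale / (1 - shape),
    #            var  = scale^2 / ((1 - shape)^2 * (1 - 2 * shape)),
    # so mean^2 / var = 1 - 2 * shape.
    ratio = mean * mean / var
    shape = 0.5 * (1.0 - ratio)
    scale = 0.5 * mean * (1.0 + ratio)
    return {"threshold": u, "shape": shape, "scale": scale,
            "n_exceed": len(excesses)}


# Synthetic stand-in for daily maximum temperatures (not real station data).
random.seed(42)
series = [random.gauss(22.0, 6.0) for _ in range(5000)]
print(gpd_tail_descriptors(series))
```

Descriptors such as the threshold, shape, and scale could then be attached to each station as node features; how DI-GNN actually uses them in the adjacency matrix and loss function is described in the paper body, not here.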

Paper Structure

This paper contains 25 sections, 21 equations, 17 figures, 3 tables.

Figures (17)

  • Figure 1: Spatial distribution of weather stations in British Columbia, Canada, used in the study. Data spans 71 stations from 2009 to 2024.
  • Figure 2: Performance metrics of the Li et al. (2023) model for heatwave prediction across 100 training epochs for the configuration $C_{\text{in}} = 10$ and $C_{\text{out}} = 3$. The metrics displayed include: Loss, Accuracy, Recall, Precision, Balanced Accuracy, and F1 Score, highlighting significant instability and low recall during training and validation phases.
  • Figure 3: Performance metrics of the Li et al. (2023) model for heatwave prediction across 100 training epochs for the configuration $C_{\text{in}} = 10$ and $C_{\text{out}} = 5$. The metrics displayed include: Loss, Accuracy, Recall, Precision, Balanced Accuracy, and F1 Score, highlighting significant instability and low recall during training and validation phases.
  • Figure 6: Performance metrics of the DI-GNN model for heatwave prediction across 100 training epochs for the configuration $C_{\text{in}} = 10$ and $C_{\text{out}} = 3$. Metrics displayed include: Loss, Accuracy, Recall, Precision, Balanced Accuracy, and F1 Score. DI-GNN demonstrates faster convergence, high recall, and consistent performance during training and validation phases.
  • Figure 7: Performance metrics of the DI-GNN model for heatwave prediction across 100 training epochs for the configuration $C_{\text{in}} = 10$ and $C_{\text{out}} = 5$. Metrics displayed include: Loss, Accuracy, Recall, Precision, Balanced Accuracy, and F1 Score. The DI-GNN model demonstrates stable performance with improved recall and precision compared to baseline methods.
  • ...and 12 more figures
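
The metrics named in the captions above can all be derived from a binary confusion matrix. The sketch below is illustrative only (the helper `binary_metrics` and the toy labels are assumptions, not data from the paper); it shows why plain accuracy can look strong on an imbalanced heatwave dataset while recall and balanced accuracy expose a useless classifier.

```python
def binary_metrics(y_true, y_pred):
    """Compute common binary classification metrics from 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    # Guard every ratio against an empty denominator.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "recall": recall,
        "precision": precision,
        # Balanced accuracy: mean of recall on each class.
        "balanced_accuracy": 0.5 * (recall + specificity),
        "f1": f1,
    }


# Toy example: 5 heatwave days out of 100, classifier never predicts one.
y_true = [1] * 5 + [0] * 95
y_pred = [0] * 100
print(binary_metrics(y_true, y_pred))
```

In this toy case accuracy is 0.95 even though no heatwave is detected, while recall is 0 and balanced accuracy is 0.5, which is the behaviour the abstract refers to when it calls accuracy misleading on imbalanced data.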