
Regression modelling of spatiotemporal extreme U.S. wildfires via partially-interpretable neural networks

Jordan Richards, Raphaël Huser

TL;DR

This work addresses the challenge of estimating and interpreting extreme-value quantiles for spatiotemporal environmental processes, focusing on U.S. wildfires. It introduces partially-interpretable neural networks (PINNs) that combine interpretable linear/spline components with neural-network-based nonparametric parts to model extreme-value regression within a point-process framework. The key contributions include the bGEV-PP model, a convolutional PINN architecture for tails, and a thorough data-driven analysis showing improved tail predictions and interpretable drivers of wildfire extremes, along with hazard maps and temporal trends. The framework is scalable to high-dimensional predictors and provides actionable insights into where and when extreme wildfires are most likely and how meteorological drivers shape their severity.

Abstract

Risk management in many environmental settings requires an understanding of the mechanisms that drive extreme events. Useful metrics for quantifying such risk are extreme quantiles of response variables conditioned on predictor variables that describe, e.g., climate, biosphere and environmental states. Typically these quantiles lie outside the range of observable data and so, for estimation, require specification of parametric extreme value models within a regression framework. Classical approaches in this context utilise linear or additive relationships between predictor and response variables and suffer in either their predictive capabilities or computational efficiency; moreover, their simplicity is unlikely to capture the truly complex structures that lead to the creation of extreme wildfires. In this paper, we propose a new methodological framework for performing extreme quantile regression using artificial neural networks, which are able to capture complex non-linear relationships and scale well to high-dimensional data. The "black box" nature of neural networks means that they lack the desirable trait of interpretability often favoured by practitioners; thus, we unify linear and additive regression methodology with deep learning to create partially-interpretable neural networks that can be used for statistical inference but retain high prediction accuracy. To complement this methodology, we further propose a novel point process model for extreme values which overcomes the finite lower-endpoint problem associated with the generalised extreme value class of distributions. Efficacy of our unified framework is illustrated on U.S. wildfire data with a high-dimensional predictor set and we illustrate vast improvements in predictive performance over linear and spline-based regression techniques.
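
The idea of a partially-interpretable predictor can be illustrated with a minimal sketch: a model parameter is written, on a link scale, as an interpretable linear term in a few chosen predictors plus a neural-network term in the remaining ones. Everything named here (`init_mlp`, `mlp`, `pinn_predictor`, the layer sizes, the ReLU activation, the simulated inputs) is a hypothetical choice made for illustration and is not taken from the paper; the log link for the scale parameter is likewise an assumption.

```python
import numpy as np

def init_mlp(rng, n_in, n_hidden):
    # Hypothetical one-hidden-layer network; the sizes are arbitrary.
    return {
        "W1": rng.normal(scale=0.1, size=(n_in, n_hidden)),
        "b1": np.zeros(n_hidden),
        "w2": rng.normal(scale=0.1, size=n_hidden),
    }

def mlp(params, x):
    # "Black box" part: ReLU hidden layer followed by a linear read-out.
    hidden = np.maximum(x @ params["W1"] + params["b1"], 0.0)
    return hidden @ params["w2"]

def pinn_predictor(x_interp, x_nn, eta0, beta, params):
    # Interpretable linear part plus neural-network part, on the link scale.
    return eta0 + x_interp @ beta + mlp(params, x_nn)

rng = np.random.default_rng(0)
n, d_interp, d_nn = 5, 2, 10
x_interp = rng.normal(size=(n, d_interp))  # e.g. two predictors of interest
x_nn = rng.normal(size=(n, d_nn))          # remaining high-dimensional predictors
beta = np.array([0.5, -0.2])               # illustrative coefficients
params = init_mlp(rng, d_nn, 8)

eta = pinn_predictor(x_interp, x_nn, 0.1, beta, params)
scale = np.exp(eta)  # assumed log link keeps the scale parameter positive
```

Because the interpretable term enters additively, raising the first interpretable predictor by one unit shifts `eta` by exactly `beta[0]`, whatever the network outputs. This is the sense in which the coefficients remain readable.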

Paper Structure (38 sections, 16 equations, 16 figures, 4 tables)


Figures (16)

  • Figure 1: Maps of observed $\log\{1+\sqrt{Y}(s,t)\}$ (log-$\sqrt{\hbox{acres}}$; top-left), 2m air temperature (K; top-right), 3-month SPI (unitless; bottom-left), and proportion of grassland coverage (unitless; bottom-right) for July, 2007.
  • Figure 2: Functional boxplots of estimates of the additive function contributions of 2m air temperature (K; top row) and 3-month SPI (unitless; bottom row) to the log-odds of occurrence probability $p_0(s,t)$ (unitless; left column), log-location $\log \{q_\alpha(s,t)\}$ (log-$\sqrt{\hbox{acres}}$; centre column), and log-scale $\log \{s_\beta(s,t)\}$ (log-$\sqrt{\hbox{acres}}$; right column) of the global PINN model. Black curves give the median function over all bootstrap samples, and are enclosed within the 50$\%$ central regions (magenta). Blue curves denote the maximum envelopes, and the red dashed curves represent outlier candidates that fall outside these envelopes. Red horizontal lines denote a zero effect, whilst the red triangles denote the placement of the splines' knots.
  • Figure 3: Maps of median estimated $m_\mathcal{I}(\cdot)$ for the local PINN model, where $m_\mathcal{I}(\cdot)$ is a linear model with state-varying coefficients. Mapped values correspond to the contribution of 2m air temperature (K; left column) and 3-month SPI (unitless; right column) to the log-odds of occurrence probability $p_0(s,t)$ (unitless; top row), log-location $\log \{q_\alpha(s,t)\}$ (log-$\sqrt{\hbox{acres}}$; centre row), and log-scale $\log \{s_\beta(s,t)\}$ (log-$\sqrt{\hbox{acres}}$; bottom row). Shaded regions correspond to insignificant estimates, where the $95\%$ bootstrap confidence interval includes zero.
  • Figure 4: Maps of observed $\log\{1+\sqrt{Y}(s,t)\}$ (log-$\sqrt{\hbox{acres}}$; top-left) and median estimated $p_0(s,t):=\Pr\{Y(s,t)>0\mid\mathbf{X}(s,t)\}$ (unitless; top-right), $q_\alpha(s,t)$ ($\sqrt{\hbox{acres}}$; centre-left), $\log \{1+s_\beta(s,t)\}$ (log-$\sqrt{\hbox{acres}}$; centre-right), $\xi(s)$ (unitless; bottom-left), and $90\%$ quantile of $\log\{1+ \sqrt{Y}(s,t)\}\mid\{Y(s,t)>0,\mathbf{X}(s,t)\}$ (log-$\sqrt{\hbox{acres}}$; bottom-right) for July, 2007.
  • Figure 5: Maps of median estimated quantiles for $\log\{1+\sqrt{Y}(s,t)\}\mid\mathbf{X}(s,t)$ (log-$\sqrt{\hbox{acres}}$) for July 2007. Quantiles, left to right: $90\%$, $95\%$, and $99\%$.
  • ...and 11 more figures
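
The captions above report burnt area on the scale $\log\{1+\sqrt{Y}\}$ and pair an occurrence probability $p_0 = \Pr\{Y>0\}$ with quantiles conditional on $Y>0$. As a rough illustration of how such quantities relate, the sketch below implements the transformation, its inverse, and the generic mixture identity $\Pr(Y \le y) = (1-p_0) + p_0 \Pr(Y \le y \mid Y>0)$ for $y \ge 0$. The function names are hypothetical, and the identity is a standard fact about a non-negative variable with a point mass at zero, not necessarily the computation used by the authors.

```python
import numpy as np

def to_log_sqrt(y):
    # Map burnt area y >= 0 (acres) to log(1 + sqrt(y)).
    return np.log1p(np.sqrt(y))

def from_log_sqrt(z):
    # Inverse map: recover y from z = log(1 + sqrt(y)).
    return np.expm1(z) ** 2

def positive_part_level(tau, p0):
    # Level of the Y | Y > 0 distribution whose quantile equals the
    # unconditional tau-quantile of Y. Returns None when the unconditional
    # quantile is zero, i.e. when tau <= 1 - p0.
    if tau <= 1.0 - p0:
        return None
    return 1.0 - (1.0 - tau) / p0

# Example: if fires occur with probability 0.1, the unconditional 99% quantile
# is the 90% quantile of the positive part.
level = positive_part_level(0.99, 0.1)
```

Since the transformation is monotone increasing, quantiles on the $\log\{1+\sqrt{Y}\}$ scale can be mapped back to acres with `from_log_sqrt` without changing their level.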