Conditional Extremes with Graphical Models

Aiden Farrell, Emma F. Eastoe, Clement Lee

TL;DR

This work extends the conditional multivariate extreme value model (CMEVM) to data on graphs by introducing the MVAGG residual distribution and a structured, graphical CMEVM (SCMEVM). By combining asymmetric generalised Gaussian margins with a Gaussian copula and enforcing sparsity through graph-based precision matrices, the method accommodates both asymptotic independence (AI) and asymptotic dependence (AD), enabling accurate high-dimensional predictions and structure learning. A stepwise inference framework preserves information while improving scalability, and structure learning via the graphical lasso enables data-driven sparsity. An application to the upper Danube River basin demonstrates improved tail-dependence predictions and highlights AI/AD mixtures that are not well captured by traditional AD-only models, indicating practical utility for risk assessment on river networks and similar infrastructure.

Abstract

Multivariate extreme value analysis quantifies the probability and magnitude of joint extreme events. River discharges from the upper Danube River basin provide a challenging dataset for such analysis because the data, which is measured on a spatial network, exhibits both asymptotic dependence and asymptotic independence. To account for both features, we extend the conditional multivariate extreme value model (CMEVM) with a new approach for the residual distribution. This allows sparse (graphical) dependence structures and fully parametric prediction. Our approach fills a current gap in statistical methodology by extending graphical extremes models to asymptotically independent random variables. Further, the model can be used to learn the graphical dependence structure when it is unknown a priori. To support inference in high dimensions, we propose a stepwise inference procedure that is computationally efficient and loses no information or predictive power. We show our method is flexible and accurately captures the extremal dependence for the upper Danube River basin discharges.
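
The distinction between asymptotic dependence and asymptotic independence can be explored empirically with a threshold-based coefficient of tail dependence. The sketch below is illustrative only and is not taken from the paper: it assumes the common form $\eta(u) = \log(1-u) / \log \mathbb{P}[U_1 > u, U_2 > u]$ on rank-transformed uniform margins, restricted to $u \in (0,1)$, and the helper names `to_uniform_ranks` and `empirical_eta` are hypothetical. Under this form, values near 1 are consistent with asymptotic dependence and values near 1/2 with independence in the tail.

```python
import math
import random


def to_uniform_ranks(values):
    """Rank-transform a sample to (0, 1) using rank / (n + 1)."""
    n = len(values)
    order = sorted(range(n), key=lambda k: values[k])
    ranks = [0.0] * n
    for r, k in enumerate(order, start=1):
        ranks[k] = r / (n + 1)
    return ranks


def empirical_eta(x, y, u):
    """Assumed estimator: log(1 - u) / log(empirical joint exceedance prob.)."""
    if not 0.0 < u < 1.0:
        raise ValueError("u must lie strictly between 0 and 1")
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    ux, uy = to_uniform_ranks(x), to_uniform_ranks(y)
    joint = sum(1 for a, b in zip(ux, uy) if a > u and b > u) / len(x)
    if joint == 0.0 or joint == 1.0:
        # No joint exceedances, or threshold below every rank: undefined.
        return float("nan")
    # Cap at 1, the theoretical upper bound of the coefficient.
    return min(1.0, math.log(1.0 - u) / math.log(joint))


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the demo is reproducible
    n = 20000
    z1 = [rng.gauss(0.0, 1.0) for _ in range(n)]
    z2 = [rng.gauss(0.0, 1.0) for _ in range(n)]
    # Identical series: complete dependence at every threshold.
    print("identical series:  ", empirical_eta(z1, z1, 0.95))
    # Independent series: estimate should sit close to 1/2.
    print("independent series:", empirical_eta(z1, z2, 0.95))
```

In practice the estimate would be computed over a grid of thresholds $u$ and read alongside its sampling variability, since few joint exceedances remain as $u$ approaches 1.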

Paper Structure

This paper contains 37 sections, 15 equations, 33 figures, 4 tables, 6 algorithms.

Figures (33)

  • Figure 1: Undirected tree induced by the flow connections of the upper Danube River basin (left) with sites 16, 19 and 29 in blue. Scatter plots on standard Fréchet margins (centre) and empirical estimates of $\eta(u)$ (right) for $u \in (0,1]$ for sites 19 and 29 (top) and 16 and 29 (bottom).
  • Figure 2: Boxplots detailing the bias of $\hat{\alpha}_{j \mid i}$ for distinct $i, j \in V$. Each row corresponds to the conditioning variable $i$, and each column corresponds to the sample size. The fill of the boxplots denotes the different models. The red dashed line indicates the $y = 0$ line.
  • Figure 3: True underlying graphical structure (left) and the inferred graphical structure (right), with line width and darkness indicating the number of times each edge was selected across 100 samples.
  • Figure 4: Boxplots of empirical and model-based estimates of $\Gamma_{\mid i}$, for each $i \in V$, when the data is generated from a mixture distribution. Each row corresponds to the conditioning variable $i$, and each column corresponds to the correlation parameter. The colour of the boxplots distinguishes the different models. The black dashed line indicates the $y = 0$ line.
  • Figure 5: Boxplots of the bias in $p_1=\mathbb{P}[X_{1} > v_{1}, X_{2} > v_{2} \mid X_{3} > u_{3}]$ (left) and $p_2=\mathbb{P}[X_{3} > v_{3}, X_{4} > v_{4} \mid X_{5} > u_{5}]$ (right). The fill of the boxplots distinguishes the bias from the different models. The black dashed line indicates the $y = 0$ line.
  • ...and 28 more figures