Learning extremal graphical structures in high dimensions
Sebastian Engelke, Michaël Lalancette, Stanislav Volgushev
TL;DR
This work develops a framework for learning extremal graphical structures in high dimensions. It proves non-asymptotic concentration bounds for empirical extremal variograms under general domain of attraction conditions and proposes EGlearn, a majority-voting algorithm based on $L^1$-regularized optimization that recovers any connected extremal graph under Hüsler–Reiss models, with consistency guaranteed under explicit conditions. The methodology is validated in simulations on Barabási–Albert and block-graph models and applied to hydrological (river discharge) and financial (currency exchange) data, where the estimated tail-dependence graphs are interpretable and align with domain knowledge. The results support high-dimensional inference on tail dependence and leave uncertainty quantification and extensions to other variogram estimators and base learners as avenues for further work.
Abstract
Extremal graphical models encode the conditional independence structure of multivariate extremes. Key statistics for learning extremal graphical structures are empirical extremal variograms, for which we prove non-asymptotic concentration bounds that hold under general domain of attraction conditions. For the popular class of Hüsler–Reiss models, we propose a majority voting algorithm for learning the underlying graph from data through $L^1$-regularized optimization. Our concentration bounds are used to derive explicit conditions that ensure the consistent recovery of any connected graph. The methodology is illustrated through a simulation study as well as the analysis of river discharge and currency exchange data.
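To make the two ingredients named in the abstract concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes the usual rank-based construction of an empirical extremal variogram: for each root variable m, keep the k observations with the largest m-th component, compute the sample variance of differences of log empirical survival probabilities, and average over m. The function names `empirical_extremal_variogram` and `majority_vote`, the `n + 1` rank scaling, and the voting rule (an edge is kept when more than half of the d - 2 roots other than its endpoints support it) are illustrative assumptions; the base learner that produces the per-root votes (for example an $L^1$-regularized estimator) is deliberately left out.

```python
import numpy as np


def empirical_extremal_variogram(X, k):
    """Average over roots m of rank-based empirical extremal variograms.

    X : (n, d) array of observations; k : number of extreme observations per root.
    Assumption: ranks are scaled by n + 1 so that 1 - F_hat stays strictly positive.
    """
    X = np.asarray(X, dtype=float)
    n, d = X.shape
    # Column-wise ranks 1..n (ties broken arbitrarily; continuous data assumed).
    ranks = X.argsort(axis=0).argsort(axis=0) + 1
    log_surv = np.log(1.0 - ranks / (n + 1.0))  # log(1 - F_hat_i(X_ti))
    gamma = np.zeros((d, d))
    for m in range(d):
        # Indices of the k observations whose m-th component is largest.
        idx = np.argsort(X[:, m])[-k:]
        C = np.cov(log_surv[idx], rowvar=False)  # d x d sample covariance
        v = np.diag(C)
        # Var(L_i - L_j) = Var(L_i) + Var(L_j) - 2 Cov(L_i, L_j)
        gamma += v[:, None] + v[None, :] - 2.0 * C
    return gamma / d


def majority_vote(votes):
    """Combine per-root edge estimates into a single graph.

    votes : (d, d, d) boolean array; votes[m, i, j] is True when the (hypothetical)
    base learner run with root m selects edge (i, j). Row and column m of votes[m]
    are ignored, so each edge receives d - 2 votes.
    """
    votes = np.asarray(votes, dtype=bool)
    d = votes.shape[0]
    counts = np.zeros((d, d))
    for m in range(d):
        keep = np.ones(d, dtype=bool)
        keep[m] = False
        block = np.ix_(keep, keep)
        counts[block] += votes[m][block]
    adj = counts > (d - 2) / 2.0  # strict majority among the d - 2 voters
    np.fill_diagonal(adj, False)
    return adj
```

In this sketch the variogram estimate would feed a base learner once per root m, and `majority_vote` would then aggregate the resulting d edge sets; choosing k trades bias (large k) against variance (small k).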
