Regression modeling of multivariate precipitation extremes under regular variation
Rishikesh Yadav, Arnab Hazra
TL;DR
The paper estimates the occurrence and intensity of multivariate precipitation extremes using a regression framework grounded in regular variation. Target quantities are first estimated empirically at sub-asymptotic thresholds, then used as responses in a simple regression whose form follows from the theoretical tail scaling, and the fitted regression is extrapolated to very high thresholds. The approach assumes independence across climate-model runs and no long-term trend. A bootstrap procedure supplies confidence intervals for the extrapolated tail quantities, and the method achieved competitive results in the EVA2025 data challenge. The approach offers a computationally efficient, interpretable tool for climate-risk assessment in high-dimensional extreme-value settings, and the paper outlines directions for incorporating nonlinearities, dependence, and spatial covariates in future work.
Abstract
Motivated by the EVA2025 data challenge, in which we participated as team DesiBoys, we propose a regression strategy within the framework of regular variation to estimate the occurrences and intensities of high precipitation extremes derived from different climate runs of the CESM2 Large Ensemble Community Project (LENS2). Our approach first empirically estimates the target quantities at sub-asymptotic (lower threshold) levels and then uses them as response variables in a simple regression framework arising from the theoretical expressions of joint regular variation. Although a seasonal pattern is evident in the data, the precipitation intensities do not exhibit any significant long-term trend across years. Moreover, we can safely assume that the data are independent across different climate model runs, which simplifies the modeling framework. Once the regression parameters are estimated, we employ a standard prediction approach to infer precipitation levels at very high quantiles. We calculate the confidence intervals using a nonparametric block bootstrap procedure. While likelihood-based inference grounded in multivariate extreme value theory may provide more accurate estimates and confidence intervals, it would involve a significantly higher computational burden. Our simple and computationally straightforward two-stage approach provides reasonable estimates of the desired quantities and secured us joint second position in the final rankings of the EVA2025 conference data challenge.
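The two-stage idea can be illustrated with a minimal univariate sketch. It is not the authors' implementation: the abstract does not give the regression form, so the sketch assumes the simplest consequence of regular variation, namely that the exceedance probability behaves like a power law, making log P(X > u) approximately linear in log u. The function names, the threshold grid, the moving-block scheme, and the block length are all hypothetical choices made for illustration, and the thresholds are assumed to be strictly positive.

```python
import numpy as np


def fit_tail_regression(x, probs):
    """Stage 1: empirical exceedance probabilities at moderate thresholds.
    Stage 2: least-squares fit of log p_hat on log threshold.
    Assumes a power-law tail and strictly positive thresholds."""
    x = np.asarray(x, dtype=float)
    thresholds = np.quantile(x, probs)  # sub-asymptotic thresholds
    p_hat = np.array([(x > u).mean() for u in thresholds])
    slope, intercept = np.polyfit(np.log(thresholds), np.log(p_hat), 1)
    return slope, intercept


def extrapolate_prob(slope, intercept, u_high):
    """Predict the exceedance probability at a threshold beyond the data range."""
    return float(np.exp(intercept + slope * np.log(u_high)))


def block_bootstrap_ci(x, probs, u_high, block_len=30, n_boot=200,
                       level=0.95, seed=0):
    """Percentile interval from a moving-block bootstrap (illustrative choice).
    Blocks of consecutive observations are resampled with replacement so that
    short-range temporal dependence is retained within each block."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    n = len(x)
    n_blocks = int(np.ceil(n / block_len))
    offsets = np.arange(block_len)
    estimates = np.empty(n_boot)
    for b in range(n_boot):
        starts = rng.integers(0, n - block_len + 1, size=n_blocks)
        idx = (starts[:, None] + offsets[None, :]).ravel()[:n]
        slope, intercept = fit_tail_regression(x[idx], probs)
        estimates[b] = extrapolate_prob(slope, intercept, u_high)
    alpha = 1.0 - level
    lower, upper = np.quantile(estimates, [alpha / 2, 1 - alpha / 2])
    return float(lower), float(upper)
```

For example, calling `fit_tail_regression(x, np.linspace(0.90, 0.99, 10))` on a positive series `x` returns a slope that plays the role of the negative tail index under the power-law assumption, and `block_bootstrap_ci` wraps the same two stages to attach an interval to the extrapolated probability. The multivariate case treated in the paper would replace the marginal exceedance event with a joint one.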
