Regression modeling of multivariate precipitation extremes under regular variation

Rishikesh Yadav, Arnab Hazra

TL;DR

The paper predicts extreme, multivariate precipitation events with a regression framework grounded in regular variation. Target quantities are first estimated empirically at sub-asymptotic (lower-threshold) levels; a simple regression motivated by the tail-scaling expressions of joint regular variation then extrapolates them to very high thresholds. The model assumes independence across climate-model runs and no long-term trend. A nonparametric block bootstrap supplies confidence intervals for the extrapolated tail quantities, and the method placed joint second in the EVA2025 data challenge. The approach offers a computationally efficient, interpretable tool for climate-risk assessment in high-dimensional extreme-value settings, and the paper outlines directions for incorporating nonlinearities, dependence, and spatial covariates in future work.

Abstract

Motivated by the EVA2025 data challenge, where we participated as the team DesiBoys, we propose a regression strategy within the framework of regular variation to estimate the occurrences and intensities of high precipitation extremes derived from different climate runs of the CESM2 Large Ensemble Community Project (LENS2). Our approach first empirically estimates the target quantities at sub-asymptotic (lower threshold) levels and sets them as response variables within a simple regression framework arising from the theoretical expressions of joint regular variation. Although a seasonal pattern is evident in the data, the precipitation intensities do not exhibit any significant long-term trends across years. Besides, we can safely assume the data to be independent across different climate model runs, thereby simplifying the modeling framework. Once the regression parameters are estimated, we employ a standard prediction approach to infer precipitation levels at very high quantiles. We calculate the confidence intervals using a nonparametric block bootstrap procedure. While a likelihood-based inference grounded in multivariate extreme value theory may provide more accurate estimates and confidence intervals, it would involve a significantly higher computational burden. Our proposed simple and computationally straightforward two-stage approach provides reasonable estimates for the desired quantities, securing us a joint second position in the final rankings of the EVA2025 conference data challenge competition.
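
The two-stage idea described above (empirical estimates at lower thresholds, a regression, extrapolation, then a block bootstrap) can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes a univariate regularly varying tail, $P(X > u) \approx C u^{-\alpha}$, so that log exceedance counts are roughly linear in $\log u$; the exact regression form and transformations used in the paper are not given in this summary. The function names, block length, and synthetic Pareto data are all hypothetical choices made for illustration.

```python
import numpy as np

def exceedance_counts(x, thresholds):
    """Stage 1: empirical number of observations above each threshold."""
    x = np.asarray(x, dtype=float)
    return np.array([(x > u).sum() for u in thresholds])

def fit_log_log(thresholds, counts):
    """Stage 2: least-squares fit of log(count) = a + b * log(u).

    Under the assumed tail model, b plays the role of -alpha.
    """
    b, a = np.polyfit(np.log(thresholds), np.log(counts), 1)
    return a, b

def extrapolate_count(a, b, u_high):
    """Predict the expected exceedance count at a much higher threshold."""
    return float(np.exp(a + b * np.log(u_high)))

def block_bootstrap_ci(x, thresholds, u_high, block_len=30,
                       n_boot=200, level=0.95, seed=0):
    """Nonparametric block bootstrap: resample contiguous blocks with
    replacement, refit the regression, and take percentile limits."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    n_blocks = len(x) // block_len
    blocks = x[: n_blocks * block_len].reshape(n_blocks, block_len)
    estimates = []
    for _ in range(n_boot):
        idx = rng.integers(0, n_blocks, size=n_blocks)
        sample = blocks[idx].ravel()
        counts = exceedance_counts(sample, thresholds)
        if np.any(counts == 0):  # log(0) is undefined; skip this resample
            continue
        a, b = fit_log_log(thresholds, counts)
        estimates.append(extrapolate_count(a, b, u_high))
    lo, hi = np.quantile(estimates, [(1 - level) / 2, (1 + level) / 2])
    return float(lo), float(hi)

if __name__ == "__main__":
    # Synthetic stand-in data (NOT LENS2 output): Pareto tail with alpha = 3.
    rng = np.random.default_rng(1)
    x = rng.pareto(3.0, size=20000) + 1.0
    thresholds = np.quantile(x, [0.90, 0.92, 0.94, 0.96, 0.98])
    a, b = fit_log_log(thresholds, exceedance_counts(x, thresholds))
    u_high = 2 * thresholds[-1]
    print("slope:", b)
    print("extrapolated count:", extrapolate_count(a, b, u_high))
    print("95% CI:", block_bootstrap_ci(x, thresholds, u_high))
```

Resampling whole blocks rather than single days is meant to preserve short-range temporal dependence within each resample; the block length of 30 is an arbitrary placeholder and would need tuning for real daily precipitation series.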

Paper Structure (12 sections, 22 equations, 3 figures, 1 table)

Figures (3)

  • Figure 1: Left: Estimated slope components (blue dots) of the GEV location parameter fitted to yearly maxima, with 95% confidence intervals (vertical bars). Right: Daily average precipitation across four runs with LOWESS smoothing.
  • Figure 2: Relationship between the thresholds $u_l$ and the empirical counts $Z_l$ for Task 1 (first row), Task 2 (second row), and Task 3 (third row), shown under different transformations (one per column). The $R^2$ value of each regression fit is shown to assess goodness of fit.
  • Figure 3: Histograms for the bootstrap samples for Task 1 (left), Task 2 (middle), and Task 3 (right) quantities.