On the optimal prediction of extreme events in heavy-tailed time series with applications to solar flare forecasting

Victor Verma, Stilian Stoev, Yang Chen

TL;DR

This work derives a Neyman–Pearson-type framework for optimal prediction of extreme events in heavy-tailed time series, showing that calibrated optimal predictors are governed by density ratios $r(X)=f_1(X)/f_0(X)$ and calibrations $F_{r(X)}^{\leftarrow}(q)$. It develops explicit closed-form predictors for additive and linear models, extends the theory to autoregressive and infinite-variance moving-average processes, and provides asymptotic results linking extremal precision to tail dependence coefficients. For practical inference in AR$(d)$ models, a plug-in approach based on robust coefficient estimation yields asymptotically calibrated and optimal predictors under mild regularity conditions; the asymptotic extremal-precision results are derived for MA$(\infty)$ via regular variation theory. The methodology is illustrated with solar-flare forecasting using GOES soft X-ray flux data, comparing baseline, AR, and FARIMA models to quantify fundamental limits and guide operational risk forecasting in a real-world, long-memory, heavy-tailed setting. Overall, the paper formalizes fundamental limits on predicting extremes in heavy-tailed time series and demonstrates how to approach such predictions in practice, with concrete implications for solar flare forecasting and related extreme-event tasks.

Abstract

The prediction of extreme events in time series is a fundamental problem arising in many financial, scientific, engineering, and other applications. We begin by establishing a general Neyman-Pearson-type characterization of optimal extreme event predictors in terms of density ratios. This yields new insights and several closed-form optimal extreme event predictors for additive models. These results naturally extend to time series, where we study optimal extreme event prediction for both light- and heavy-tailed autoregressive and moving average models. Using a uniform law of large numbers for ergodic time series, we establish the asymptotic optimality of an empirical version of the optimal predictor for autoregressive models. Using multivariate regular variation, we obtain an expression for the optimal extremal precision in heavy-tailed infinite moving averages, which provides theoretical bounds on the ability to predict extremes in this general class of models. We address the important problem of predicting solar flares by applying our theory and methodology to a state-of-the-art time series consisting of solar soft X-ray flux measurements. Our results demonstrate the success and limitations in solar flare forecasting of long-memory autoregressive models and long-range-dependent, heavy-tailed FARIMA models.
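
To make the density-ratio characterization concrete, here is a minimal sketch in a toy additive model, not an example from the paper. It assumes $Y = X + \varepsilon$ with independent standard normal $X$ and $\varepsilon$, and takes $f_1$ and $f_0$ to be the densities of $X$ given that the event $\{Y > y_0\}$ does or does not occur. It also assumes that "calibrated" means the alarm rate matches the event rate. These readings, and the names `density_ratio` and `simulate_precision`, are illustrative assumptions.

```python
import math
import random

def normal_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def density_ratio(x, y0, p_event, noise_sd=1.0):
    """r(x) = f_1(x) / f_0(x) in the assumed model Y = X + eps, eps ~ N(0, noise_sd^2).

    By Bayes' rule the ratio equals the conditional odds of the event
    {Y > y0} given X = x, divided by the unconditional odds.
    """
    p_given_x = 1.0 - normal_cdf((y0 - x) / noise_sd)
    odds_given_x = p_given_x / (1.0 - p_given_x)
    prior_odds = p_event / (1.0 - p_event)
    return odds_given_x / prior_odds

def simulate_precision(n=20000, p_event=0.05, seed=0):
    """Precision of the rule 'alarm when r(X) exceeds its (1 - p_event) sample quantile'."""
    rng = random.Random(seed)
    xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
    ys = [x + rng.gauss(0.0, 1.0) for x in xs]
    # Event level: empirical (1 - p_event) quantile of Y.
    y0 = sorted(ys)[int((1.0 - p_event) * n)]
    ratios = [density_ratio(x, y0, p_event) for x in xs]
    # Alarm level: empirical (1 - p_event) quantile of r(X), so alarms are as frequent as events.
    alarm_level = sorted(ratios)[int((1.0 - p_event) * n)]
    n_alarms = sum(r > alarm_level for r in ratios)
    n_hits = sum(r > alarm_level and y > y0 for r, y in zip(ratios, ys))
    return n_hits / n_alarms

if __name__ == "__main__":
    print("precision of the density-ratio rule:", simulate_precision())
    print("precision of a random alarm is roughly the event rate:", 0.05)
```

In this toy model the ratio is increasing in $x$, so thresholding $r(X)$ is the same as thresholding $X$ itself. That equivalence is a property of the assumed Gaussian setup and should not be read as a general statement about the paper's models.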

Paper Structure (38 sections, 27 theorems, 239 equations, 6 figures, 5 tables, 1 algorithm)

Key Result

Lemma 2.1

For any random variable $\xi$, we have: (i) $F_\xi^\leftarrow(F_\xi(\xi)) = \xi$, almost surely. (ii) $F_\xi(F_\xi^\leftarrow(q)) = q$, for all $q \in \mathrm{Range}(F_\xi)$, i.e.,
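
The two identities in the lemma can be checked by hand on a small discrete distribution. The sketch below assumes the standard generalized inverse $F^\leftarrow(q) = \inf\{x : F(x) \ge q\}$ for $0 < q \le 1$; the paper's Definition 2.1 is not reproduced here, so that definition is an assumption. The three-atom distribution is likewise made up for illustration.

```python
from bisect import bisect_left, bisect_right
from fractions import Fraction

# Hypothetical discrete distribution: P(xi=0)=1/2, P(xi=1)=1/4, P(xi=3)=1/4.
ATOMS = [0, 1, 3]
PROBS = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 4)]
CUM = [sum(PROBS[: i + 1]) for i in range(len(PROBS))]  # [1/2, 3/4, 1]

def cdf(x):
    """F(x) = P(xi <= x), computed exactly with fractions."""
    k = bisect_right(ATOMS, x)
    return CUM[k - 1] if k > 0 else Fraction(0)

def quantile(q):
    """Assumed generalized inverse: smallest atom x with F(x) >= q, for 0 < q <= 1."""
    return ATOMS[bisect_left(CUM, q)]

if __name__ == "__main__":
    # (i) F^{<-}(F(x)) = x at every atom, i.e. almost surely.
    for x in ATOMS:
        print("atom", x, "->", quantile(cdf(x)))
    # (ii) F(F^{<-}(q)) = q when q is in the range of F, but not otherwise.
    for q in (Fraction(3, 4), Fraction(3, 5)):
        print("q =", q, "-> F(F^{<-}(q)) =", cdf(quantile(q)))
```

Here $q = 3/4$ lies in the range of $F$ and is recovered exactly. The value $q = 3/5$ does not lie in the range, and the composition returns $3/4$ instead, which shows why part (ii) is restricted to $\mathrm{Range}(F_\xi)$.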

Figures (6)

  • Figure 1.1: Left: the GOES soft X-ray flux time series over the window 1 September 2011 to 31 May 2014. Right: the time series between 11:30 PM on 6 March 2014 and 1:00 AM on 7 March 2014, during which a strong solar flare occurred.
  • Figure 4.1: A plot of the combined soft X-ray flux time series from the maxima of solar cycles 23 and 24, which correspond to the date ranges 1 January 2000 to 31 December 2002 and 1 September 2011 to 31 May 2014, respectively. The horizontal lines correspond to the 0.9, 0.95, and 0.99 sample quantiles of the combined dataset. Note that the time series is plotted on a logarithmic scale.
  • Figure 4.2: Left: estimates of $\alpha$ for the FARIMA$(0, d, 0)$ models with symmetric $\alpha$-stable innovations that were fit over the training windows. Center: estimates of $d$ for the various models. Right: a scatterplot showing the relationship between $\widehat{\alpha}$ and $\widehat{d}$; for these models, $d$ must be less than $1 - 1 / \alpha$, hence the curve in the upper left.
  • Figure A.1: Results from the autoregressive model simulation study. Data was generated from a particular AR(5) model. One hundred training set-test set pairs were generated; the model was fit to the training set, and the fitted model was used to predict exceedances of high quantiles in the test set. The quantile levels used are on the horizontal axis. On each of the 100 runs, the precision was calculated from the test set predictions. The dashed orange line represents the asymptotic precision of the optimal predictor.
  • Figure A.2: Left: a contour plot showing how the one-step-ahead extremal optimal precision for the FARIMA$(0, d, 0)$ model with symmetric $\alpha$-stable innovations varies as $\alpha$ and $d$ vary. Right: the extremal optimal precision as a function of $h$, for $(\alpha, d) = (1.4, 0.19)$, which is represented by the red dot in the left panel.
  • ...and 1 more figure
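
The train/test protocol described in the caption of Figure A.1 can be imitated in miniature. The sketch below is not the paper's study. It assumes an AR(1) model rather than AR(5), a coefficient of 0.7, symmetric Pareto innovations with tail index 1.5, a least-squares fit, and a simple alarm rule that thresholds the fitted one-step prediction. All of these choices and all function names are assumptions made for illustration, and no claim is made that this rule is the optimal predictor.

```python
import random

def simulate_ar1(n, phi, alpha, rng, burn_in=500):
    """Simulate X_t = phi * X_{t-1} + eps_t with symmetric Pareto(alpha) innovations."""
    x, path = 0.0, []
    for t in range(n + burn_in):
        eps = rng.paretovariate(alpha) * rng.choice((-1.0, 1.0))
        x = phi * x + eps
        if t >= burn_in:  # drop the burn-in so the path is close to stationary
            path.append(x)
    return path

def fit_ar1(path):
    """Least-squares estimate of phi (an assumed estimator, chosen for brevity)."""
    num = sum(curr * prev for curr, prev in zip(path[1:], path[:-1]))
    den = sum(prev * prev for prev in path[:-1])
    return num / den

def empirical_quantile(values, q):
    ordered = sorted(values)
    return ordered[min(int(q * len(ordered)), len(ordered) - 1)]

def holdout_precision(train, test, q):
    """Fit on train, raise alarms on test, return the share of alarms followed by an exceedance."""
    phi_hat = fit_ar1(train)
    event_level = empirical_quantile(train, q)
    # Calibrate the alarm level so alarms occur about as often as events on the training set.
    alarm_level = empirical_quantile([phi_hat * x for x in train[:-1]], q)
    alarms = hits = 0
    for prev, curr in zip(test[:-1], test[1:]):
        if phi_hat * prev > alarm_level:
            alarms += 1
            hits += curr > event_level
    return hits / alarms if alarms else float("nan")

if __name__ == "__main__":
    rng = random.Random(1)
    train = simulate_ar1(20000, 0.7, 1.5, rng)
    test = simulate_ar1(20000, 0.7, 1.5, rng)
    for q in (0.90, 0.95, 0.99):
        print("quantile level", q, "test precision", holdout_precision(train, test, q))
```

Repeating the run over many independent train/test pairs and plotting the precisions against the quantile level would give a picture in the spirit of Figure A.1, although the numbers would reflect this toy setup rather than the paper's.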

Theorems & Definitions (110)

  • Definition 2.1
  • Lemma 2.1
  • proof
  • Remark 2.1
  • Remark 2.2
  • Remark 2.3
  • Theorem 2.1
  • Remark 2.4
  • proof: Proof of Theorem 2.1
  • Remark 2.5
  • ...and 100 more