Scoring Rules with Normalized Upper Order Statistics for Tail Inference

Martin Bladt, Christoffer Øhlenschlæger

Abstract

This paper proposes a scoring-rule-based method for ranking predictive distributions in the Fréchet domain that is able to distinguish between different tail indices. The approach is built on normalized order statistics and exploits proper scoring rules to compare tail limit distributions in a distributional framework, with direct relevance for insurance claim-severity tails. On the theoretical side, consistency and asymptotic normality for empirical tail scores based on normalized upper order statistics are obtained through residual estimation theory. Simulation results demonstrate that the scoring-rule-based approach is capable of discriminating between different tail behaviors in finite samples and that trends in the scaling have only a minor impact on stability. We further show that optimizing scoring rules (equivalently, minimizing the associated loss form) yields consistent tail-index estimators and that the classical Hill estimator arises as a special case. The performance of the proposed method is investigated and compared with the Hill estimator across a range of tail indices. Lastly, we analyze an automobile claim-severity data set to demonstrate how scoring rules can be used to rank predictive models based on tail predictions in actuarial settings.
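The link between score optimization and the Hill estimator can be illustrated with a minimal sketch. It assumes that the tail limit is the standard Pareto distribution with density $\gamma^{-1} y^{-1/\gamma-1}$ on $y>1$, and that the empirical logarithmic score is the average log-density over the $k$ upper order statistics divided by the $(k+1)$-th largest observation. Under these assumptions the score is maximized in $\gamma$ exactly at the mean log-ratio, which is the Hill estimator. The function names, sample size, and choice of $k$ are hypothetical and not taken from the paper, whose estimator may be defined differently.

```python
import math
import random


def normalized_upper_order_stats(sample, k):
    """Return the k largest observations divided by the (k+1)-th largest."""
    ordered = sorted(sample)
    threshold = ordered[-(k + 1)]  # plays the role of Y_{n-k,n}
    return [y / threshold for y in ordered[-k:]]


def empirical_log_score(sample, k, gamma):
    """Average Pareto(gamma) log-density over the normalized exceedances.

    Assumed limit density: (1/gamma) * y**(-1/gamma - 1) for y > 1.
    """
    ratios = normalized_upper_order_stats(sample, k)
    mean_log = sum(math.log(r) for r in ratios) / k
    return -math.log(gamma) - (1.0 / gamma + 1.0) * mean_log


def hill_estimator(sample, k):
    """Classical Hill estimator: mean log-ratio of the top k order statistics."""
    ratios = normalized_upper_order_stats(sample, k)
    return sum(math.log(r) for r in ratios) / k


# Illustrative data: standard Frechet sample with tail index 1,
# generated by inverse transform, Y = (-log U)^(-1).
rng = random.Random(0)
sample = [(-math.log(rng.random())) ** -1.0 for _ in range(10_000)]
k = 500

hill = hill_estimator(sample, k)

# Compare the score at the Hill estimate with the score at perturbed values.
scores = {g: empirical_log_score(sample, k, g) for g in (0.8 * hill, hill, 1.2 * hill)}
best_gamma = max(scores, key=scores.get)
```

Setting the derivative of $-\log\gamma - (1/\gamma + 1)\,m$ to zero, with $m$ the mean log-ratio, gives $\gamma = m$, so `best_gamma` coincides with `hill` in this sketch.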

Paper Structure (14 sections, 7 theorems, 57 equations, 8 figures, 1 table)

Key Result

Theorem 2

Let Assumption ass:segers_ass hold. In addition, assume that $k,n \to \infty$ and $k/n \to 0$. Let denote the order statistics of the sample. Then in probability, where $Y^\circ$ has distribution $G^\circ$.

Figures (8)

  • Figure 1: Empirical logarithmic scores in \ref{eq: estimator} (vertical axis) plotted against the number of upper order statistics $k$ (horizontal axis) for candidate tail indices $\gamma \in \{0.8, 1, 1.2, 1.5\}$. Left panels use Fréchet data-generating distributions, right panels use Burr distributions, and rows correspond to $n=10^3,10^4,10^5$. Higher curves indicate better tail fit; the true value is $\gamma_G=1$.
  • Figure 2: Proportion of simulations (based on 100 Monte Carlo replications) in which the empirical logarithmic score in \ref{eq: estimator} is maximized at the true tail index $\gamma=1$ among candidate values $\gamma \in \{0.8, 1, 1.2, 1.5\}$, plotted against the relative number of upper order statistics $k/n$. The left plot uses Fréchet data-generating distributions and the right plot uses Burr distributions. Curves correspond to sample sizes $n=10^3,10^4,10^5$. Higher values indicate that the logarithmic score more frequently identifies the true tail index.
  • Figure 3: Empirical logarithmic scores (vertical axis) versus $k$ (horizontal axis) for Fréchet baseline samples with heterogeneous scaling, $Y_i=X_iZ_i$, where $Z_i$ has tail index $\gamma_G=1$. Candidate tail indices are $\gamma \in \{0.8, 1, 1.2, 1.5\}$. Left panels use $X_i^1=i/n$, right panels use $X_i^2=1.5+0.5\sin(6\pi i/n)$, and rows correspond to $n=10^3,10^4,10^5$.
  • Figure 4: Empirical logarithmic scores (vertical axis) versus $k$ (horizontal axis) for Burr baseline samples with heterogeneous scaling, $Y_i=X_iZ_i$, where $Z_i$ has tail index $\gamma_G=1$. Candidate tail indices are $\gamma \in \{0.8, 1, 1.2, 1.5\}$. Left panels use $X_i^1=i/n$, right panels use $X_i^2=1.5+0.5\sin(6\pi i/n)$, and rows correspond to $n=10^3,10^4,10^5$.
  • Figure 5: Proportion of simulations (based on 100 Monte Carlo replications) in which the empirical logarithmic score in \ref{eq: estimator} is maximized at the true tail index $\gamma=1$ among candidate values $\gamma \in \{0.8, 1, 1.2, 1.5\}$, plotted against the relative number of upper order statistics $k/n$. The top panels correspond to Fréchet data-generating distributions and the bottom panels to Burr distributions. The left panels use a linearly varying scale, while the right panels use a sinusoidally varying scale. Curves correspond to sample sizes $n=10^3,10^4,10^5$. Higher values indicate that the logarithmic score more frequently identifies the true tail index.
  • ...and 3 more figures
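
The heterogeneous-scaling setup described in the captions of Figures 3 and 5 can be sketched as follows. This is an illustrative reconstruction, not the authors' simulation code: it assumes a standard Fréchet baseline $Z_i$ with tail index 1, the two scale sequences $X_i^1=i/n$ and $X_i^2=1.5+0.5\sin(6\pi i/n)$ from the captions, and a Pareto-type logarithmic score on the top $k$ observations divided by the $(k+1)$-th largest. The function names, the seed, and the single choice $k/n = 0.05$ are hypothetical.

```python
import math
import random


def empirical_log_score(sample, k, gamma):
    """Mean Pareto(gamma) log-density of the top-k observations over Y_{n-k,n}.

    Assumed limit density: (1/gamma) * y**(-1/gamma - 1) for y > 1.
    """
    ordered = sorted(sample)
    threshold = ordered[-(k + 1)]
    mean_log = sum(math.log(y / threshold) for y in ordered[-k:]) / k
    return -math.log(gamma) - (1.0 / gamma + 1.0) * mean_log


def linear_scale(i, n):
    """Linearly varying scale X_i = i / n."""
    return i / n


def sinusoidal_scale(i, n):
    """Sinusoidally varying scale X_i = 1.5 + 0.5 * sin(6 * pi * i / n)."""
    return 1.5 + 0.5 * math.sin(6 * math.pi * i / n)


def simulate_scaled_frechet(n, scale, rng):
    """Draw Y_i = X_i * Z_i with Z_i = 1 / (-log U_i), a standard Frechet variable."""
    return [scale(i, n) / (-math.log(rng.random())) for i in range(1, n + 1)]


candidates = (0.8, 1.0, 1.2, 1.5)
n, k = 100_000, 5_000
rng = random.Random(1)

# For each scaling, record which candidate tail index attains the highest score.
best = {}
for name, scale in (("linear", linear_scale), ("sinusoidal", sinusoidal_scale)):
    sample = simulate_scaled_frechet(n, scale, rng)
    scores = {g: empirical_log_score(sample, k, g) for g in candidates}
    best[name] = max(scores, key=scores.get)
```

Repeating the loop over many seeds and several values of $k/n$ would give selection proportions of the kind plotted in Figure 5; the sketch evaluates only a single replication.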

Theorems & Definitions (18)

  • Definition 1: $\mathcal{P}$-quasi-integrable
  • Definition 2: Scoring rule
  • Definition 3: Proper scoring rule
  • Theorem 2
  • Theorem 3
  • Corollary 4
  • proof
  • Remark 5
  • Theorem 6
  • Lemma 7
  • ...and 8 more