Structural grouping of extreme value models via graph fused lasso

Takuma Yoshida, Koki Momoki, Shuichi Kawano

Abstract

The generalized Pareto distribution (GPD) is a fundamental model for analyzing the tail behavior of a distribution. In particular, the shape parameter of the GPD characterizes the extremal properties of the distribution. In this paper, we propose a method for grouping the shape parameters of the GPD across clustered data via the graph fused lasso. The proposed method simultaneously estimates the model parameters and identifies which clusters can be grouped together. We establish the asymptotic theory of the proposed estimator and demonstrate that its variance is lower than that of the cluster-wise estimator. This variance reduction not only enhances estimation stability but also provides a principled basis for identifying homogeneity and heterogeneity among clusters in terms of their tail behavior. We assess the performance of the proposed estimator through Monte Carlo simulations. As an illustrative example, we apply our method to rainfall data from 996 clustered sites across Japan.
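To make the idea in the abstract concrete, the following is a minimal sketch of the kind of objective a graph fused lasso over GPD shape parameters would minimize: the sum of cluster-wise GPD negative log-likelihoods plus an L1 penalty on differences of shape parameters along the edges of a graph. It is an illustration under stated assumptions, not the authors' implementation: the function names (`gpd_nll`, `fused_objective`), the unweighted penalty, and the single tuning parameter `lam` are all hypothetical, and the paper's exact penalty weights and graph construction are not given in this excerpt.

```python
import math


def gpd_nll(excesses, gamma, sigma):
    """Negative log-likelihood of a GPD with shape gamma and scale sigma,
    evaluated on threshold excesses y > 0."""
    if sigma <= 0:
        return math.inf
    total = 0.0
    for y in excesses:
        if abs(gamma) < 1e-12:
            # gamma -> 0 limit: exponential distribution with scale sigma
            total += math.log(sigma) + y / sigma
        else:
            z = 1.0 + gamma * y / sigma
            if z <= 0:
                # y lies outside the support for this (gamma, sigma)
                return math.inf
            total += math.log(sigma) + (1.0 / gamma + 1.0) * math.log(z)
    return total


def fused_objective(data, gammas, sigmas, edges, lam):
    """Cluster-wise GPD fit plus an (assumed, unweighted) fused lasso penalty.

    data   : list of lists, data[j] holds the excesses of cluster j
    gammas : shape parameter per cluster
    sigmas : scale parameter per cluster
    edges  : list of (j, k) pairs of clusters linked in the graph
    lam    : hypothetical tuning parameter controlling the amount of fusion
    """
    fit = sum(gpd_nll(data[j], gammas[j], sigmas[j]) for j in range(len(data)))
    penalty = lam * sum(abs(gammas[j] - gammas[k]) for j, k in edges)
    return fit + penalty
```

Because the penalty is an absolute difference, a sufficiently large `lam` can drive the shape parameters of linked clusters to be exactly equal, which is how estimation and grouping can happen in a single step. The scale parameters are left unpenalized in this sketch, so each cluster keeps its own scale.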

Paper Structure (26 sections, 3 theorems, 105 equations, 19 figures, 1 table)

Key Result

Theorem 1

Suppose that (C1) and (C2) hold and that $n_{\cal A}^{1/2}\max_{j\in{\cal A}}\alpha_j(w_j)\rightarrow 0$. Then, as $n\rightarrow \infty$, … and …. Furthermore, for any $j\in{\cal A}$, $\{n_{{\cal A}}\}^{1/2}\, \mathrm{Cov}(\hat{\gamma},\hat{\sigma}_j)\rightarrow 0$ as $n\rightarrow \infty$.

Figures (19)

  • Figure 1: Left: The cluster-wise estimator (black), the proposed estimator (red), and the true shape parameter (blue) for all clusters. The solid lines are medians. Dashed lines are lower/upper 2.5% quantiles. Right: ratio of MSE for all clusters $j=1,\ldots,J$.
  • Figure 2: Results for 95%-CI of return level for $j=1,\ldots,J$. Left: Coverage probabilities of CI from the proposed method (red) and the cluster-wise method (black). Right: Average of the ratio of length of CI of the proposed estimator over the cluster-wise estimator.
  • Figure 3: Locations of 996 sites of the main island of Japan.
  • Figure 4: Estimator of shape parameter for each site. Left: Cluster-wise estimator. Right: Proposed estimator.
  • Figure 5: Clusters grouped with selected clusters in Table 1.
  • ...and 14 more figures

Theorems & Definitions (6)

  • Theorem 1
  • Theorem 2
  • Lemma 1
  • Proof of Lemma \ref{Hessian}
  • Proof of Theorem \ref{Oracle}
  • Proof of Theorem \ref{MainTheorem}