Robust Tail Index Estimation under Random Censoring via Minimum Density Power Divergence
Nour Elhouda Guesmia, Abdelhakim Necir, Djamel Meraghni
TL;DR
This work develops a robust tail-index estimator for Pareto-type tails under random right censoring by extending the minimum density power divergence (MDPD) framework to censored extreme value models. The proposed estimator $\hat{\gamma}_{1,\alpha}$ minimizes a Nelson–Aalen-based MDPD objective. It is consistent under first-order regular variation when $p>1/2$, where $p$ denotes the asymptotic proportion of uncensored observations in the tail, and asymptotically normal under a second-order framework with appropriate scaling. In extensive simulations, the estimator proves robust to contamination in the upper tail, with the tuning parameter $\alpha$ trading efficiency against robustness, and it compares favorably with classical censored-tail estimators. Real-data applications to insurance losses (weak censoring) and AIDS survival times (strong censoring) illustrate both the practical benefits and the limitations of the method, and motivate future extensions to extreme censoring regimes and to covariates. Overall, the paper provides a principled, robust alternative for tail inference in censored data, backed by theoretical guarantees and empirical evidence.
Abstract
We propose a robust estimator for the tail index of Pareto-type distributions under random right-censoring, based on the minimum density power divergence (MDPD) framework. To our knowledge, this is the first application of the MDPD approach to extreme value models with random censoring, opening a new direction for robust inference in this setting. Under mild regularity conditions, the estimator is shown to be consistent and asymptotically normal. Its performance in finite samples is extensively evaluated through simulation studies, demonstrating superior robustness and efficiency compared to existing methods. Contamination is introduced only before censoring to provide a meaningful assessment of robustness, while contamination after censoring is shown to yield distorted or unrealistic results. The practical relevance of the approach is illustrated using a real dataset on insurance claims, which features light censoring and fully observable extremes, and a dataset on AIDS survival times, which, despite stronger censoring ($p<1/2$), allows for illustrative comparisons and highlights practical limitations and challenges in more difficult scenarios.
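
To give a feel for the MDPD idea, the following is a minimal Python sketch for a much simpler setting than the one treated in the paper: uncensored data from a strict Pareto model with density $f_\gamma(x)=\gamma^{-1}x^{-1/\gamma-1}$, $x\ge 1$. It is not the Nelson–Aalen-based censored estimator $\hat{\gamma}_{1,\alpha}$. It minimizes the standard empirical density power divergence objective $\int f_\gamma^{1+\alpha} - (1+1/\alpha)\,n^{-1}\sum_i f_\gamma(X_i)^\alpha$, using the closed form $\int_1^\infty f_\gamma^{1+\alpha} = \gamma^{-\alpha}/(1+\alpha+\alpha\gamma)$ for this model. The function names, the sample size, the search interval, the choice $\alpha=0.5$, and the contamination scheme (5% of the points replaced by $10^4$) are all illustrative assumptions.

```python
import math
import random


def mdpd_objective(gamma, xs, alpha):
    """Empirical density power divergence objective for the strict Pareto
    model f(x) = (1/gamma) * x**(-1/gamma - 1), x >= 1 (uncensored data)."""
    # Closed form of the integral of f**(1 + alpha) over [1, inf).
    integral = gamma ** (-alpha) / (1.0 + alpha + alpha * gamma)
    # Sample mean of f(x_i)**alpha.
    mean_f_alpha = sum(
        (x ** (-1.0 / gamma - 1.0) / gamma) ** alpha for x in xs
    ) / len(xs)
    return integral - (1.0 + 1.0 / alpha) * mean_f_alpha


def mdpd_tail_index(xs, alpha, lo=0.05, hi=5.0, grid=200, iters=60):
    """Minimise the objective over gamma in [lo, hi]: a coarse grid brackets
    the minimiser, then golden-section search refines it."""
    step = (hi - lo) / grid
    best = min(range(grid + 1),
               key=lambda i: mdpd_objective(lo + i * step, xs, alpha))
    a = max(lo, lo + (best - 1) * step)
    b = min(hi, lo + (best + 1) * step)
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    for _ in range(iters):
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if mdpd_objective(c, xs, alpha) < mdpd_objective(d, xs, alpha):
            b = d  # minimiser lies in [a, d]
        else:
            a = c  # minimiser lies in [c, b]
    return 0.5 * (a + b)


def mle_tail_index(xs):
    """Maximum likelihood estimate of gamma for the strict Pareto model
    (the mean of the log observations); not robust to outliers."""
    return sum(math.log(x) for x in xs) / len(xs)


# Hypothetical experiment: a clean sample and a copy with 5% gross outliers.
rng = random.Random(0)
true_gamma = 0.5
clean = [rng.paretovariate(1.0 / true_gamma) for _ in range(1000)]
n_bad = len(clean) // 20
contaminated = [1e4] * n_bad + clean[n_bad:]

for label, xs in (("clean", clean), ("contaminated", contaminated)):
    print(label,
          "MLE:", round(mle_tail_index(xs), 3),
          "MDPD:", round(mdpd_tail_index(xs, alpha=0.5), 3))
```

In this toy setting, points far out in the tail contribute almost nothing to the term $f_\gamma(X_i)^\alpha$, so they barely move the minimizer, whereas they enter the likelihood-based estimate through $\log X_i$ at full weight. Smaller values of $\alpha$ bring the MDPD fit closer to maximum likelihood, and larger values downweight extremes more strongly.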
