Censored extreme value estimation
Martin Bladt, Igor Rodionov
TL;DR
The paper develops a unified framework for censored extreme value analysis by combining Kaplan–Meier survival methods with extreme value theory through extreme Kaplan–Meier integrals (EKMI). A central decomposition expresses an EKMI as a sum of (conditionally) i.i.d. terms plus a vanishing remainder, which yields consistency and asymptotic normality under regular variation; residual-based methods are then extended to all max-domains of attraction. Using generalized EKMI and generalized residual estimators, the authors derive censored versions of Hill-type and moment estimators for the extreme value index, including bias and second-order considerations, and assess finite-sample behavior through simulations and a brain cancer dataset. The work enables tail inference and tail-quantile estimation under random censoring across the Fréchet, Gumbel, and Weibull domains, provides practical, threshold-robust procedures for real-data applications, and opens paths to estimating broader tail characteristics under censoring.
Abstract
A novel and comprehensive methodology designed to tackle the challenges posed by extreme values in the context of random censorship is introduced. The main focus is on the analysis of integrals based on the product-limit estimator of normalized upper order statistics, called extreme Kaplan–Meier integrals. These integrals allow for the transparent derivation of various important asymptotic distributional properties, offering an alternative approach to conventional plug-in estimation methods. Notably, this methodology demonstrates robustness and wide applicability across various tail regimes. A noteworthy by-product is the extension of generalized Hill-type estimators of extremes to encompass arbitrary tail behavior, which is of independent interest. The theoretical framework is applied to construct novel estimators of real-valued extreme value indices for right-censored data. Simulation studies confirm the asymptotic results and, where a competing estimator is available, mostly show superiority in mean squared error. An application to brain cancer data demonstrates that censoring effects are properly accounted for, even when focusing solely on tail classification.
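To make the central object concrete, the following is a minimal sketch of an extreme Kaplan–Meier integral, under stated assumptions rather than the authors' implementation. It assumes positive observations `z` (the minimum of the variable of interest and the censoring variable), indicators `delta` (1 if uncensored), no ties, and the standard product-limit jump weights applied to the top `k` order statistics after normalizing by the threshold order statistic. The function names `km_weights` and `ekmi` are hypothetical. With `phi = log`, the integral gives a Hill-type estimate for heavy tails, and without censoring it reduces to the classical Hill estimator.

```python
import numpy as np

def km_weights(delta_sorted):
    """Jumps of the Kaplan-Meier (product-limit) estimator.

    delta_sorted: censoring indicators (1 = uncensored) ordered by
    increasing observation. Assumes no ties among the observations.
    """
    delta = np.asarray(delta_sorted, dtype=float)
    m = delta.size
    i = np.arange(1, m + 1)
    # Survival factor ((m - j) / (m - j + 1)) ** delta_j for each j.
    factors = ((m - i) / (m - i + 1.0)) ** delta
    # Product over j < i, i.e. the estimated survival just before point i.
    surv_before = np.concatenate(([1.0], np.cumprod(factors)[:-1]))
    return delta / (m - i + 1.0) * surv_before

def ekmi(z, delta, k, phi=np.log):
    """Sketch of an extreme Kaplan-Meier integral of phi.

    Integrates phi against the product-limit estimator built from the
    top k observations, normalized by the (k+1)-th largest observation.
    """
    z = np.asarray(z, dtype=float)
    delta = np.asarray(delta)
    n = z.size
    if not 1 <= k < n:
        raise ValueError("k must satisfy 1 <= k < n")
    order = np.argsort(z, kind="stable")
    z_sorted, d_sorted = z[order], delta[order]
    threshold = z_sorted[n - k - 1]      # (k+1)-th largest observation
    top = z_sorted[n - k:] / threshold   # normalized upper order statistics
    weights = km_weights(d_sorted[n - k:])
    return float(np.sum(weights * phi(top)))
```

If the largest observation is censored, the weights sum to less than one; how that missing mass is handled is a design choice this sketch leaves open.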
