Data-Driven Sequential Sampling for Tail Risk Mitigation
Dohyun Ahn, Taeho Kim
TL;DR
The paper tackles the problem of selecting, under a fixed sampling budget, the alternative with the smallest tail risk among several heavy-tailed alternatives whose distributions are unknown. It introduces TIRO, a data-driven sequential policy that uses tail-index estimators to prioritize sampling from the weakest tail (smallest $\beta_i$) and pairs this with a rate-optimal allocation that maximizes the large-deviation decay rate ${\rm G}(\boldsymbol{\rho})$ of the probability of false selection. To address practical issues, the authors extend TIRO to I-TIRO, which handles ties via EVT-based estimators and adaptively tunes the hyperparameter $\boldsymbol{\delta}$ while preserving asymptotic optimality. Numerical studies with Pareto, t, and Fréchet losses show that TIRO, and especially I-TIRO, outperform state-of-the-art baselines in extreme-risk, large-sample regimes, including scenarios with tied tail indices. The framework provides a principled, nonparametric approach to tail-risk ranking and could be extended to broader distribution classes and combined with variance-reduction techniques.
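The summary above refers to tail-index estimators without naming one. As a minimal sketch, assuming the classical Hill estimator (a standard choice for heavy-tailed data, not necessarily the estimator used in TIRO or I-TIRO), the tail of a single alternative could be estimated as follows. The function name `hill_estimator` and the cutoff `k` are hypothetical, and how the resulting quantity maps to the paper's $\beta_i$ is not specified here.

```python
import math
import random


def hill_estimator(losses, k):
    """Hill estimate of the extreme value index gamma = 1/alpha.

    Uses the k largest observations, with the (k+1)-th largest as threshold.
    Illustrative helper only; not taken from the paper.
    """
    x = sorted(losses)
    n = len(x)
    if not 1 <= k < n:
        raise ValueError("k must satisfy 1 <= k < len(losses)")
    threshold = x[n - k - 1]  # (k+1)-th largest observation
    if threshold <= 0:
        raise ValueError("the threshold order statistic must be positive")
    # Average log-excess of the k largest observations over the threshold.
    return sum(math.log(v / threshold) for v in x[n - k:]) / k


# Synthetic check on Pareto losses with shape alpha = 2, so gamma = 0.5.
random.seed(0)
alpha = 2.0
sample = [random.paretovariate(alpha) for _ in range(20_000)]
gamma_hat = hill_estimator(sample, k=1_000)
alpha_hat = 1.0 / gamma_hat  # implied Pareto-type tail index
print(f"gamma_hat = {gamma_hat:.3f}, alpha_hat = {alpha_hat:.3f}")
```

In a sequential policy, an estimate of this kind would be recomputed for each alternative as new samples arrive. The choice of `k` trades bias against variance and is left fixed here for simplicity.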
Abstract
Given a finite collection of stochastic alternatives, we study the problem of sequentially allocating a fixed sampling budget to identify the optimal alternative with a high probability, where the optimal alternative is defined as the one with the smallest value of extreme tail risk. We particularly consider a situation where these alternatives generate heavy-tailed losses whose probability distributions are unknown and may not admit any specific parametric representation. In this setup, we propose data-driven sequential sampling policies that maximize the rate at which the likelihood of falsely selecting suboptimal alternatives decays to zero. We rigorously demonstrate the superiority of the proposed methods over existing approaches, which is further validated via numerical studies.
