Adaptive Thresholds for Monitoring and Screening in Imbalanced Samples: Optimality and Boosting Sensitivity
Ansgar Steland
TL;DR
The work addresses threshold-based monitoring in imbalanced populations by letting the alarm threshold depend on a covariate Z, thereby boosting detection for rare classes while controlling the overall false-alarm rate. It develops adaptive rules (notably the proportional and gamma-proportional rules), establishes sufficient optimality conditions, and provides nonparametric estimation procedures for unknown distribution functions, with a functional delta-method CLT and bootstrap for uncertainty quantification. The framework is extended to residual-based standardization, with a comprehensive theoretical treatment via Hadamard differentiability and residual empirical processes. Empirical illustration on a diabetes dataset and extensive simulations demonstrate improved minority-class sensitivity and reliable uncertainty quantification, highlighting practical value for screening and monitoring in heterogeneous populations.
Abstract
Suppose (standardized) measurements or statistics are monitored to raise an alarm when a threshold is exceeded. Often, the underlying population is heterogeneous with respect to important discrete variables, and thus samples may consist of imbalanced classes. We propose to use thresholds that depend on such covariates to boost the sensitivity for rare classes, which otherwise tend to be ignored. Under mild conditions, we identify optimal threshold functions and develop a feasible procedure for their computation. Further, for the proportional rule, a nonparametric estimator of the threshold function is proposed and a central limit theorem is shown, including the case where the conditional mean and variance used for standardization are estimated. For feasible uncertainty quantification, a bootstrap scheme is proposed. The approach is illustrated and evaluated by a real data analysis.
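To make the idea of covariate-dependent thresholds concrete, the following minimal sketch computes one threshold per class from empirical quantiles so that the overall false-alarm rate stays near a target level alpha. The allocation used here (each class receives an equal share of the false-alarm budget) is an assumption chosen purely for illustration. It is not claimed to be the proportional or gamma-proportional rule of the paper, whose definitions are not given in this summary. All function names and the simulated data are hypothetical.

```python
import math
import random


def empirical_quantile(values, q):
    """Smallest order statistic x whose empirical CDF value is at least q."""
    xs = sorted(values)
    k = max(0, min(len(xs) - 1, math.ceil(q * len(xs)) - 1))
    return xs[k]


def adaptive_thresholds(samples_by_class, alpha):
    """Class-specific thresholds c(z) with overall false-alarm rate near alpha.

    Illustrative assumption (not taken from the paper): every class gets an
    equal share alpha / K of the overall budget, so the class-conditional
    false-alarm rate is alpha_z = alpha / (K * p_z), capped at 1. Then
    sum_z p_z * alpha_z = alpha whenever no cap is active.
    """
    n = sum(len(v) for v in samples_by_class.values())
    num_classes = len(samples_by_class)
    thresholds = {}
    for z, values in samples_by_class.items():
        p_z = len(values) / n  # estimated class probability
        alpha_z = min(1.0, alpha / (num_classes * p_z))
        thresholds[z] = empirical_quantile(values, 1.0 - alpha_z)
    return thresholds


def overall_alarm_rate(samples_by_class, thresholds):
    """In-sample fraction of observations exceeding their class threshold."""
    n = sum(len(v) for v in samples_by_class.values())
    alarms = sum(
        sum(1 for x in values if x > thresholds[z])
        for z, values in samples_by_class.items()
    )
    return alarms / n


# Simulated in-control data: standardized measurements in two imbalanced classes.
rng = random.Random(0)
data = {
    "majority": [rng.gauss(0.0, 1.0) for _ in range(9000)],
    "minority": [rng.gauss(0.0, 1.0) for _ in range(1000)],
}
thresholds = adaptive_thresholds(data, alpha=0.05)
rate = overall_alarm_rate(data, thresholds)
```

Under this assumed allocation the rare class receives a lower threshold than the frequent class, which raises its detection sensitivity, while the pooled in-sample alarm rate remains close to alpha. The thresholds are estimates, so in practice their uncertainty would need to be quantified, for example by a bootstrap as the abstract proposes.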
