Conservative Decisions with Risk Scores
Yishu Wei, Wen-Yee Lee, George Ekow Quaye, Xiaogang Su
TL;DR
This work introduces a convex, SVM-inspired framework for conservative binary classification with abstention. It selects a cutoff interval $(c-d, c+d)$ on risk scores, abstaining inside the interval and classifying outside it, so that the interval width is minimized while a specified coverage is maintained. It derives a theoretical solution at the population level showing that the Bayes threshold determines the interval center and that symmetry and a tunable penalty parameter $\gamma$ control the trade-off between accuracy and abstention. The approach yields a practical risk-coverage curve and an associated area-under-the-curve metric (AuRC), and is demonstrated via simulations and a prostate cancer diagnosis case study, where a VOC-based risk score substantially outperforms PSA. Because it applies to risk scores that are either directly available or learned from fitted models, the method offers a principled, scalable tool for safety-critical classification with explicit uncertainty handling.
Abstract
In binary classification applications, conservative decision-making that allows for abstention can be advantageous. To this end, we introduce a novel approach that determines the optimal cutoff interval for risk scores, which can be directly available or derived from fitted models. Within this interval, the algorithm refrains from making decisions, while outside the interval, classification accuracy is maximized. Our approach is inspired by support vector machines (SVM), but differs in that it minimizes the classification margin rather than maximizing it. We provide the theoretical optimal solution to this problem, which holds important practical implications. Our proposed method not only supports conservative decision-making but also inherently results in a risk-coverage curve. Together with the area under the curve (AUC), this curve can serve as a comprehensive performance metric for evaluating and comparing classifiers, akin to the receiver operating characteristic (ROC) curve. To investigate and illustrate our approach, we conduct both simulation studies and a real-world case study in the context of diagnosing prostate cancer.
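To make the abstention rule and the risk-coverage curve concrete, the following is a minimal Python sketch, not the authors' estimation procedure. It assumes the interval center `c` is already given and simply sweeps the half-width `d` over a grid, whereas the paper obtains the interval by solving an SVM-inspired optimization problem. The function names, the use of `-1` to encode abstention, and the trapezoidal area computation are all illustrative assumptions.

```python
import numpy as np

ABSTAIN = -1  # hypothetical label used here to mark "no decision"


def abstaining_decisions(scores, c, d):
    """Classify as 1 at or above c + d, as 0 at or below c - d, abstain in between."""
    scores = np.asarray(scores, dtype=float)
    decisions = np.full(scores.shape, ABSTAIN, dtype=int)
    decisions[scores >= c + d] = 1
    decisions[scores <= c - d] = 0
    return decisions


def risk_coverage_curve(scores, labels, c, half_widths):
    """For each half-width d, return coverage (fraction decided) and
    risk (error rate among decided cases; NaN if nothing is decided)."""
    labels = np.asarray(labels)
    coverage, risk = [], []
    for d in half_widths:
        decisions = abstaining_decisions(scores, c, d)
        decided = decisions != ABSTAIN
        coverage.append(decided.mean())
        if decided.any():
            risk.append(np.mean(decisions[decided] != labels[decided]))
        else:
            risk.append(np.nan)
    return np.array(coverage), np.array(risk)


def area_under_risk_coverage(coverage, risk):
    """Trapezoidal area under the empirical risk-coverage points (NaN points dropped)."""
    keep = ~np.isnan(risk)
    cov, rsk = coverage[keep], risk[keep]
    order = np.argsort(cov)
    cov, rsk = cov[order], rsk[order]
    return float(np.sum(0.5 * (rsk[1:] + rsk[:-1]) * np.diff(cov)))
```

For example, calling `risk_coverage_curve(scores, labels, c=0.5, half_widths=np.linspace(0, 0.4, 9))` on a vector of predicted probabilities traces how the error rate among decided cases changes as the abstention interval widens. A lower area indicates less risk across coverage levels, under the assumptions stated above.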
