Conservative Decisions with Risk Scores

Yishu Wei, Wen-Yee Lee, George Ekow Quaye, Xiaogang Su

TL;DR

This work introduces a convex, SVM-inspired framework for conservative binary classification with abstention. It selects a cutoff interval $(c-d, c+d)$ on risk scores, minimizing the interval's width while maintaining a specified coverage. A population-level theoretical solution shows that the Bayes threshold determines the interval center, and that symmetry and a tunable penalty parameter $\gamma$ control the trade-off between accuracy and abstention. The approach yields a practical risk-coverage curve and an AuRC metric, and is demonstrated through simulations and a prostate cancer diagnosis case study, where a VOC-based risk score substantially outperforms PSA. Because it works with risk scores that are either directly available or learned from fitted models, the method offers a principled, scalable tool for safety-critical classification with explicit uncertainty handling.
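The decision rule implied by a cutoff interval $(c-d, c+d)$ can be illustrated with a minimal sketch. The function names, the toy scores, and the choice to abstain strictly inside the interval are assumptions made for illustration, not the authors' implementation; the values of `c` and `d` are supplied by hand here rather than optimized.

```python
from typing import List, Optional, Tuple


def conservative_predict(scores: List[float], c: float, d: float) -> List[Optional[int]]:
    """Predict +1 at or above c + d, -1 at or below c - d, and abstain (None) in between."""
    preds: List[Optional[int]] = []
    for r in scores:
        if r >= c + d:
            preds.append(1)
        elif r <= c - d:
            preds.append(-1)
        else:
            preds.append(None)  # inside the interval: no decision is made
    return preds


def coverage_and_accuracy(
    preds: List[Optional[int]], labels: List[int]
) -> Tuple[float, Optional[float]]:
    """Coverage = fraction of cases decided; accuracy is computed on decided cases only."""
    decided = [(p, y) for p, y in zip(preds, labels) if p is not None]
    coverage = len(decided) / len(preds)
    if not decided:
        return coverage, None
    accuracy = sum(p == y for p, y in decided) / len(decided)
    return coverage, accuracy


# Hypothetical toy data: risk scores in [0, 1] with labels in {-1, +1}.
scores = [0.05, 0.20, 0.45, 0.55, 0.65, 0.90]
labels = [-1, -1, 1, -1, 1, 1]

preds = conservative_predict(scores, c=0.5, d=0.1)
coverage, accuracy = coverage_and_accuracy(preds, labels)
print(preds, coverage, accuracy)
```

In this toy example the two scores closest to the center are the ones a plain threshold at 0.5 would misclassify, so abstaining on them trades coverage for accuracy on the remaining cases.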

Abstract

In binary classification applications, conservative decision-making that allows for abstention can be advantageous. To this end, we introduce a novel approach that determines the optimal cutoff interval for risk scores, which can be directly available or derived from fitted models. Within this interval, the algorithm refrains from making decisions, while outside the interval, classification accuracy is maximized. Our approach is inspired by support vector machines (SVM), but differs in that it minimizes the classification margin rather than maximizing it. We provide the theoretical optimal solution to this problem, which holds important practical implications. Our proposed method not only supports conservative decision-making but also inherently results in a risk-coverage curve. Together with the area under the curve (AUC), this curve can serve as a comprehensive performance metric for evaluating and comparing classifiers, akin to the receiver operating characteristic (ROC) curve. To investigate and illustrate our approach, we conduct both simulation studies and a real-world case study in the context of diagnosing prostate cancer.
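One way to picture a risk-coverage curve is sketched below. It assumes the interval center `c` is held fixed while the half-width shrinks from large to small, so that cases are admitted in order of their distance from `c`; risk is taken to be the misclassification rate among decided cases, and the area is computed with the trapezoidal rule. These choices, the function names, and the toy data are illustrative assumptions, and the paper's exact construction of the curve and of AuRC may differ.

```python
from typing import List, Tuple


def risk_coverage_curve(
    scores: List[float], labels: List[int], c: float
) -> List[Tuple[float, float]]:
    """Return (coverage, risk) points as the abstention interval around c shrinks.

    Cases farthest from c are decided first; ties in distance are broken arbitrarily.
    """
    n = len(scores)
    order = sorted(range(n), key=lambda i: abs(scores[i] - c), reverse=True)
    points: List[Tuple[float, float]] = []
    errors = 0
    for k, i in enumerate(order, start=1):
        pred = 1 if scores[i] >= c else -1
        errors += int(pred != labels[i])
        points.append((k / n, errors / k))  # coverage k/n, error rate among decided
    return points


def area_under_curve(points: List[Tuple[float, float]]) -> float:
    """Trapezoidal area under the piecewise-linear curve through the points."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Hypothetical toy data: risk scores in [0, 1] with labels in {-1, +1}.
scores = [0.05, 0.20, 0.45, 0.55, 0.65, 0.90]
labels = [-1, -1, 1, -1, 1, 1]

points = risk_coverage_curve(scores, labels, c=0.5)
area = area_under_curve(points)
print(points, area)
```

With risk defined as an error rate, a smaller area in this sketch indicates fewer mistakes across coverage levels.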

Paper Structure

This paper contains 11 sections, 3 theorems, 27 equations, 7 figures, 1 table, 1 algorithm.

Key Result

Theorem 1

Assume that $\pi(r) = \Pr(y = +1 \mid r)$ is a monotone increasing function of the risk score $r$. Define $c^\star$ as the value of $r$ such that $\pi(c^\star) = 0.5$, and $d^\star$ as the threshold such that $\Pr(|r - c^\star| > d^\star) = \theta$. Additionally, we assume that $\pi(r)$ satisfies the following … Then $(c^\star, d^\star)$ is an optimal solution of (opt-population).
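The two quantities defined in Theorem 1 can be approximated from data with a plug-in sketch. Purely as an assumption for illustration, $\pi(r)$ is taken to be logistic, $\pi(r) = 1/(1 + e^{-(a + b r)})$ with $b > 0$, so that $\pi(c^\star) = 0.5$ gives $c^\star = -a/b$; the theorem itself only requires monotonicity plus a further condition that is truncated in the text above and is not checked here. The half-width is estimated as an empirical quantile of $|r - c^\star|$, and the coefficients `a`, `b` and the toy scores are hypothetical.

```python
import math
from typing import List


def bayes_center(a: float, b: float) -> float:
    """Solve pi(c) = 0.5 for the assumed logistic pi(r) = 1 / (1 + exp(-(a + b * r)))."""
    if b <= 0:
        raise ValueError("b must be positive so that pi(r) is increasing")
    return -a / b


def half_width(scores: List[float], c: float, theta: float) -> float:
    """Smallest observed distance d whose empirical Pr(|r - c| > d) is at most theta."""
    dists = sorted(abs(r - c) for r in scores)
    n = len(dists)
    # At least ceil((1 - theta) * n) cases must fall within distance d of c.
    k = min(max(math.ceil((1 - theta) * n), 1), n)
    return dists[k - 1]


# Hypothetical logistic coefficients and toy risk scores.
a, b = -5.0, 10.0
scores = [0.05, 0.20, 0.45, 0.55, 0.65, 0.90]

c_star = bayes_center(a, b)
d_star = half_width(scores, c_star, theta=0.5)
outside = sum(abs(r - c_star) > d_star for r in scores) / len(scores)
print(c_star, d_star, outside)
```

Here `outside` is the empirical counterpart of $\Pr(|r - c^\star| > d^\star)$, which the theorem sets equal to $\theta$; with finite samples it can only be matched up to the granularity $1/n$.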

Figures (7)

  • Figure 1: Illustration of SVM (a) and conservative decision with hard (b) and soft (c) margins based on risk scores. Positive cases are denoted as solid dots while negative cases are denoted as circles.
  • Figure 2: Ideal separation case without (a) and with (b) noise, examined in Section \ref{sec-study1}. The results are based on the median values of the $c$ and $d$ estimates, as well as the coverage and accuracy, obtained from 200 simulation runs.
  • Figure 3: Boxplots of AuRC values from Study 2 in Section \ref{sec-study2}. For each setting, AuRC values from training samples are plotted first, followed by AuRC values from test samples.
  • Figure 4: Comparing classifiers through the risk-coverage (RC) curve and the ROC curve: (a) Averaged risk-coverage curve and ROC curve over 200 simulation runs; (b) Parallel boxplots of the area under the RC and ROC curves.
  • Figure 5: Simulated data for Study 3 in Section \ref{sec-study3}.
  • ...and 2 more figures

Theorems & Definitions (3)

  • Theorem 1
  • Lemma 1
  • Corollary 1