Precision and Recall Reject Curves for Classification
Lydia Fischer, Patricia Wollstadt
TL;DR
The paper addresses the challenge of evaluating classifiers with reject options when precision or recall, rather than accuracy, is the preferred performance metric, as is often the case for imbalanced data. It introduces precision-reject curves (PRCs) and recall-reject curves (RRCs) and validates them using prototype-based classifiers (learning vector quantization, LVQ, variants) with two certainty measures (Conf, RelSim) and a Bayes baseline. Across artificial data, benchmark datasets, and medical data, PRCs and RRCs provide more meaningful and reliable insights than accuracy-reject curves (ARCs), particularly at higher acceptance rates, where ARCs can mislead. The work offers practical tools for deploying reliable, high-certainty predictions in safety-critical applications and imbalanced domains, with future work targeting multi-class extensions and additional evaluation metrics.
Abstract
For some classification scenarios, it is desirable to use only those classification instances that a trained model associates with a high certainty. To obtain such high-certainty instances, previous work has proposed accuracy-reject curves. Reject curves make it possible to evaluate and compare the performance of different certainty measures over a range of thresholds for accepting or rejecting classifications. However, accuracy may not be the most suitable evaluation metric for all applications, and precision or recall may be preferable instead. This is the case, for example, for data with imbalanced class distributions. We therefore propose reject curves that evaluate precision and recall: the precision-reject curve and the recall-reject curve. Using prototype-based classifiers from learning vector quantization, we first validate the proposed curves on artificial benchmark data against the accuracy-reject curve as a baseline. We then show on imbalanced benchmarks and real-world medical data that, for these scenarios, the proposed precision- and recall-reject curves yield more accurate insights into classifier performance than accuracy-reject curves.
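The idea of a reject curve can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `reject_curves`, the choice to treat every observed certainty value as a threshold, and the convention of computing each metric only over the accepted samples are assumptions made for illustration. Given true labels, predicted labels, and a certainty score per prediction, it reports accuracy, precision, and recall of the accepted predictions at each threshold, together with the acceptance rate.

```python
import numpy as np


def reject_curves(y_true, y_pred, certainty, positive=1):
    """Hypothetical helper: metrics of accepted predictions per threshold.

    Assumption: a prediction is accepted when its certainty is >= threshold,
    and all metrics are computed on the accepted samples only.
    """
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    certainty = np.asarray(certainty, dtype=float)

    rows = []
    # Use each distinct certainty value as a candidate threshold (ascending).
    for theta in np.unique(certainty):
        accepted = certainty >= theta
        yt, yp = y_true[accepted], y_pred[accepted]

        tp = int(np.sum((yp == positive) & (yt == positive)))
        fp = int(np.sum((yp == positive) & (yt != positive)))
        fn = int(np.sum((yp != positive) & (yt == positive)))

        rows.append({
            "threshold": float(theta),
            "acceptance_rate": float(accepted.mean()),
            "accuracy": float(np.mean(yt == yp)),
            # Undefined ratios are reported as NaN rather than guessed.
            "precision": tp / (tp + fp) if tp + fp else float("nan"),
            "recall": tp / (tp + fn) if tp + fn else float("nan"),
        })
    return rows


if __name__ == "__main__":
    # Toy data, invented purely for illustration.
    y_true = [1, 1, 0, 0, 1, 0]
    y_pred = [1, 0, 0, 1, 1, 0]
    certainty = [0.9, 0.2, 0.8, 0.3, 0.7, 0.6]
    for row in reject_curves(y_true, y_pred, certainty):
        print(row)
```

Plotting `precision` or `recall` against `acceptance_rate` gives a precision-reject or recall-reject curve in this simplified setting, and plotting `accuracy` gives the accuracy-reject baseline for comparison.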
