The VOROS: Lifting ROC curves to 3D

Christopher Ratigan, Lenore Cowen

TL;DR

This work introduces VOROS, the Volume over the ROC Surface, a 3D generalization of the AUROC that accounts for misclassification costs and class imbalance by lifting the traditional ROC curve to a ROC surface. The authors formalize the ROC surface and a cost-based area function, derive a computable VOROS measure, and show that it is consistent with minimum expected cost while handling ranges of costs via a measure over the parameter t. They evaluate it on two benchmark datasets (Wisconsin Breast Cancer, BUSI) and a credit fraud dataset, illustrating that VOROS can yield cost-aware classifier rankings that AUROC does not provide when costs are uncertain or classes are imbalanced. The approach is efficient (O(n log n)) and adapts to bounded cost ranges; the authors discuss its limitation to binary tasks and possible generalizations to multi-class settings and instance-level costs.

Abstract

While the area under the ROC curve is perhaps the most common measure that is used to rank the relative performance of different binary classifiers, longstanding field folklore has noted that it can be a measure that ill-captures the benefits of different classifiers when either the actual class values or misclassification costs are highly unbalanced between the two classes. We introduce a new ROC surface, and the VOROS, a volume over this ROC surface, as a natural way to capture these costs, by lifting the ROC curve to 3D. Compared to previous attempts to generalize the ROC curve, our formulation also provides a simple and intuitive way to model the scenario when only ranges, rather than exact values, are known for possible class imbalance and misclassification costs.

Paper Structure (7 sections, 7 theorems, 24 equations, 7 figures, 5 tables)

This paper contains 7 sections, 7 theorems, 24 equations, 7 figures, 5 tables.

Key Result

Lemma 8

The Upper Convex Hull of an ROC curve dominates the ROC curve.
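The upper convex hull in Lemma 8 can be built with a standard monotone-chain scan. The following is a minimal sketch and not the authors' implementation: the function names `upper_convex_hull` and `hull_height` are hypothetical, ROC points are assumed to be (FPR, TPR) pairs in the unit square, and the baseline points (0, 0) and (1, 1) are assumed to belong to every ROC curve. The sort dominates the running time, which gives O(n log n).

```python
def _cross(o, a, b):
    """Signed area of the turn o -> a -> b (positive means counter-clockwise)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def upper_convex_hull(points):
    """Return the upper convex hull vertices of ROC points, ordered by FPR.

    Assumption: points are (FPR, TPR) pairs in [0, 1] x [0, 1]; the
    baselines (0, 0) and (1, 1) are always added.
    """
    pts = sorted(set(points) | {(0.0, 0.0), (1.0, 1.0)})
    hull = []
    for p in pts:
        # Drop the last vertex while it lies on or below the chord
        # joining the vertex before it to the new point p.
        while len(hull) >= 2 and _cross(hull[-2], hull[-1], p) >= 0:
            hull.pop()
        hull.append(p)
    return hull


def hull_height(hull, x):
    """Height (TPR) of the hull at FPR = x, by linear interpolation."""
    best = 0.0
    for (x0, y0), (x1, y1) in zip(hull, hull[1:]):
        if x0 <= x <= x1:
            if x1 == x0:
                # Vertical hull segment: take its top end.
                best = max(best, y0, y1)
            else:
                best = max(best, y0 + (y1 - y0) * (x - x0) / (x1 - x0))
    return best


roc_points = [(0.1, 0.5), (0.3, 0.6), (0.4, 0.9), (0.7, 0.92)]
hull = upper_convex_hull(roc_points)
# Dominance in the sense of Lemma 8: no ROC point lies above the hull.
dominated = all(y <= hull_height(hull, x) + 1e-12 for x, y in roc_points)
```

In this example the points (0.3, 0.6) and (0.7, 0.92) fall strictly below the hull, so they are dropped, while every original point remains on or under the hull.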

Figures (7)

  • Figure 1: The area of Lesser Classifiers for the point $(h,k)$ lies below the iso-performance line with slope $m=\frac{t}{1-t}$.
  • Figure 2: The two possibilities for the area of lesser classifiers for the better of the baselines.
  • Figure 3: (Left) Graph of $A_t(\{(0,0),(1,1)\})$. (Right) The Volume bounded by these areas.
  • Figure 4: (a) A plot of the ROC curves for the Wisconsin Breast Cancer dataset. (b) The VOROS for the Logistic Regression classifier. (c) The VOROS for the Bayes classifier. (d) The VOROS for the Random Forest classifier.
  • Figure 5: (a) A plot of the ROC curves for the BUSI dataset. (b) The VOROS for the Logistic Regression classifier. (c) The VOROS for the Bayes classifier. (d) The VOROS for the Random Forest classifier.
  • ...and 2 more figures
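The captions of Figures 1 to 3 describe the area of lesser classifiers as the region below an iso-performance line of slope $m=\frac{t}{1-t}$ through the point $(h,k)$. The sketch below computes that area in closed form under stated assumptions: the region is clipped to the unit square of ROC space, $h$ is the false positive rate and $k$ the true positive rate, and at $t=1$ (a vertical line) the lesser region is taken to be everything to the right of $x=h$. The function name `lesser_area` is hypothetical and the code is illustrative only, not the authors' implementation.

```python
def lesser_area(h, k, t):
    """Area of the unit square on or below the line through (h, k) with
    slope m = t / (1 - t).

    Assumptions: (h, k) = (FPR, TPR) lies in the unit square, 0 <= t <= 1,
    and for t = 1 the region is the part of the square right of x = h.
    """
    if not (0.0 <= h <= 1.0 and 0.0 <= k <= 1.0 and 0.0 <= t <= 1.0):
        raise ValueError("h, k and t must all lie in [0, 1]")
    if t == 1.0:
        return 1.0 - h          # vertical iso-performance line
    m = t / (1.0 - t)
    if m == 0.0:
        return k                # horizontal iso-performance line
    # x-coordinates where the line meets y = 0 and y = 1, clipped to [0, 1].
    a = min(max(h - k / m, 0.0), 1.0)
    b = min(max(h + (1.0 - k) / m, 0.0), 1.0)
    # Integral of the line k + m * (x - h) over [a, b].
    under_line = k * (b - a) + 0.5 * m * ((b - h) ** 2 - (a - h) ** 2)
    # Right of b the line is above the square, so the full height 1 counts.
    return under_line + (1.0 - b)


def baseline_area(t):
    """Larger of the two baseline areas, for the points (0, 0) and (1, 1)."""
    return max(lesser_area(0.0, 0.0, t), lesser_area(1.0, 1.0, t))
```

For $t<\tfrac12$ the slope is below 1 and the larger area belongs to $(1,1)$; for $t>\tfrac12$ it belongs to $(0,0)$, which corresponds to the two cases that the caption of Figure 2 refers to.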

Theorems & Definitions (31)

  • Definition 1
  • Definition 2: ROC Space
  • Definition 3
  • Definition 4
  • Definition 5
  • Definition 6
  • Definition 7
  • Lemma 8
  • proof
  • Definition 9
  • ...and 21 more