Impossibility results for equating the Youden Index with average scoring rules and Tjur $R^2$-like metrics

Linard Hoessly

TL;DR

This work investigates whether the Youden index, a classic diagnostic accuracy measure, can be represented as either the average of a real-valued scoring rule over predicted probabilities or as a Tjur $R^2$-like metric derived from probabilistic predictions. By formalizing the data setup with a binary outcome model, a $2\times2$ contingency framework, and general scoring rules $S$, the author establishes impossibility results: no continuous scoring rule $S$ yields $Youden=Av_S$ or $Youden=Ev_S$ for all feasible contingency configurations. The proofs, conducted via contradiction and case analysis on $(a,b,c,d)$, reveal fundamental obstructions to such equivalences, underscoring the distinct roles of these metrics in diagnostic assessment. The findings invite further exploration of alternative links between classification evaluation and probabilistic prediction measures, suggesting that new or different metrics may be required to bridge these perspectives.

Abstract

We consider the Youden index as well as measures evaluating predicted probabilities for the maximum-likelihood estimate of a logistic regression model with the classifier as predictor. We give impossibility results showing that the Youden index cannot equal any average of a real scoring rule, nor any metric averaging over binary outcomes (0s and 1s), for any continuous real-valued scoring rule. This shows the obstructions to such potential equivalences and highlights the distinct roles these metrics play in diagnostic assessment.
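To make the quantities in the abstract concrete, the following minimal sketch computes the Youden index, an average scoring rule, and a Tjur $R^2$-style difference from one $2\times2$ table. It rests on several assumptions not taken from the paper: the cell names `tp`, `fn`, `fp`, `tn` (the paper's $(a,b,c,d)$ layout is not given here), the Brier score as one illustrative choice of $S$, and all four cells being positive, so that the logistic fit with a binary predictor reproduces the observed frequencies within each predictor group. A single table only illustrates that the values differ; it does not reproduce the proofs.

```python
from fractions import Fraction


def youden_index(tp, fn, fp, tn):
    """Sensitivity + specificity - 1 for a 2x2 table (assumed cell naming)."""
    return Fraction(tp, tp + fn) + Fraction(tn, tn + fp) - 1


def fitted_probabilities(tp, fn, fp, tn):
    """Fitted P(y=1 | x) of a logistic model with intercept and binary predictor x.

    With all cells positive the maximum-likelihood fit equals the observed
    proportion of y=1 within each predictor group.
    """
    p_pos = Fraction(tp, tp + fp)  # group classified positive
    p_neg = Fraction(fn, fn + tn)  # group classified negative
    return p_pos, p_neg


def average_score(score, tp, fn, fp, tn):
    """Average of score(p, y) over all observations in the table."""
    p_pos, p_neg = fitted_probabilities(tp, fn, fp, tn)
    total = (tp * score(p_pos, 1) + fp * score(p_pos, 0)
             + fn * score(p_neg, 1) + tn * score(p_neg, 0))
    return total / (tp + fn + fp + tn)


def tjur_difference(tp, fn, fp, tn):
    """Mean fitted probability among y=1 minus mean among y=0."""
    p_pos, p_neg = fitted_probabilities(tp, fn, fp, tn)
    mean_cases = (tp * p_pos + fn * p_neg) / (tp + fn)
    mean_controls = (fp * p_pos + tn * p_neg) / (fp + tn)
    return mean_cases - mean_controls


def brier(p, y):
    """Brier score, used here only as an illustrative scoring rule."""
    return (p - y) ** 2


if __name__ == "__main__":
    table = (40, 10, 20, 30)  # hypothetical counts: tp, fn, fp, tn
    print("Youden index:", youden_index(*table))
    print("Average Brier score:", average_score(brier, *table))
    print("Tjur-style difference:", tjur_difference(*table))
```

For this hypothetical table the three quantities take different values, which is consistent with, but far weaker than, the impossibility statements: those concern every continuous scoring rule and all feasible tables.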

Paper Structure

This paper contains 4 sections, 2 theorems, 29 equations.

Key Result

Theorem 3.1

There is no real-valued scoring rule $S(\cdot,\cdot)$ for binary outcomes such that its average equals the Youden index for all $(a,b,c,d)\in\mathcal{A}$.

Theorems & Definitions (6)

  • Remark 2.1
  • Remark 2.2
  • Theorem 3.1
  • proof
  • Theorem 3.2
  • proof