On Rank Graduation Metrics for High Dimensional Ordinal Data
Gennaro Auricchio, Adelaide Emma Bernardelli, Paolo Giudici, Giuseppe Toscani
TL;DR
The paper addresses how to evaluate the reliability of predictions for ordinal targets by introducing the $\mathrm{RGX}_p$ metrics, a unifying rank-based family that quantifies the proportion of response variability explained by the predictions. It establishes theoretical connections between $\mathrm{RGX}_p$, Cramér-von Mises (CvM) divergences, and Lorenz/Gini concepts, and extends the framework to multivariate settings via a whitening approach. In experiments on ESG scores with linear and neural models, it reports improved accuracy, robustness, and explainability under the $\mathrm{RGX}_p$ framework, supported by Shapley-based feature attribution and Spearman rankings. The work offers a principled pathway toward trustworthy, SAFE AI in domains with ordinal outcomes and complex multivariate structure.
Abstract
Evaluating the reliability of machine learning classifications remains a fundamental challenge in Artificial Intelligence (AI), particularly when the target variable is multidimensional. Classification variables can be expressed by means of a categorical scale which, at best, is ordinal. Because ordinal data lack a natural metric structure in their underlying space, most conventional distance measures aimed at assessing the accuracy of machine learning classifications cannot be directly or meaningfully applied. In this paper, we develop a mathematical framework for comparing ordinal data based on a family of Rank Graduation $(\mathrm{RGX}_p)$ \emph{metrics}. We demonstrate that these metrics can quantify the proportion of variability of the response explained by the predictions, in a manner similar to the predictive $R^2$ for continuous response variables. After establishing theoretical connections between the $\mathrm{RGX}_p$ family and other prominent metrics in AI, we conduct extensive experiments across diverse datasets and learning tasks to evaluate their empirical performance. The results underscore the versatility, interpretability, and robustness of the $\mathrm{RGX}_p$ metrics as a principled foundation for developing trustworthy and SAFE AI systems.
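The claim that conventional distance measures are not meaningful for ordinal data can be illustrated with a minimal Python sketch. It does not implement the $\mathrm{RGX}_p$ metrics, whose definition is not given in this summary; it only contrasts the mean squared error with a standard rank-based statistic (Spearman correlation). The labels, the two numeric encodings, and all function names are hypothetical and chosen for illustration.

```python
def average_ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the block while the next sorted value is tied with this one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def mse(x, y):
    """Mean squared error between two numeric sequences."""
    return sum((a - b) ** 2 for a, b in zip(x, y)) / len(x)


# Hypothetical ordinal labels for the observed response and the prediction.
truth = ["low", "mid", "high", "mid", "low", "high"]
pred = ["low", "high", "high", "mid", "mid", "high"]

# Two order-preserving numeric encodings; neither is more "correct" than the other.
enc_a = {"low": 1, "mid": 2, "high": 3}
enc_b = {"low": 1, "mid": 2, "high": 10}

results = {}
for name, enc in (("enc_a", enc_a), ("enc_b", enc_b)):
    t = [enc[v] for v in truth]
    p = [enc[v] for v in pred]
    results[name] = (mse(t, p), spearman(t, p))
    print(name, "MSE:", results[name][0], "Spearman:", results[name][1])
```

The MSE changes with the arbitrary choice of encoding, whereas the rank-based statistic does not, because both encodings induce the same ordering. This invariance is the property that motivates rank-based measures for ordinal responses.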
