Probabilistic Scoring Lists for Interpretable Machine Learning
Jonas Hanselle, Stefan Heid, Johannes Fürnkranz, Eyke Hüllermeier
TL;DR
This work introduces probabilistic scoring lists (PSLs), an uncertainty-aware extension of scoring systems that returns probability distributions rather than deterministic decisions and evaluates features sequentially, in the manner of a decision list. The authors contribute a greedy learning algorithm, probability calibration via isotonic regression and beta calibration, feature binarization performed during the search, a framework that captures epistemic uncertainty through confidence intervals, and a ranking variant trained with a pairwise SRL objective. Experiments on medical and UCI datasets indicate that entropy minimization drives effective feature selection, that calibration quality is competitive with logistic regression (LR) on several datasets, and that the approach supports risk-aware decision-making and ranking with interpretable, stage-wise progress. The work highlights practical implications for cost-effective data collection and for transparent, robust decision support in domains where safety and interpretability are paramount, and it outlines avenues for regularization, multi-class extension, and tighter uncertainty quantification.
Abstract
A scoring system is a simple decision model that checks a set of features, adds a certain number of points to a total score for each feature that is satisfied, and finally makes a decision by comparing the total score to a threshold. Scoring systems have a long history of active use in safety-critical domains such as healthcare and justice, where they provide guidance for making objective and accurate decisions. Given their genuine interpretability, the idea of learning scoring systems from data is obviously appealing from the perspective of explainable AI. In this paper, we propose a practically motivated extension of scoring systems called probabilistic scoring lists (PSL), as well as a method for learning PSLs from data. Instead of making a deterministic decision, a PSL represents uncertainty in the form of probability distributions, or, more generally, probability intervals. Moreover, in the spirit of decision lists, a PSL evaluates features one by one and stops as soon as a decision can be made with enough confidence. To evaluate our approach, we conduct a case study in the medical domain.
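To make the two models described above concrete, the following is a minimal sketch, not the authors' implementation. It assumes binary features, integer scores, and a hypothetical per-stage lookup table (`stage_probs`) that maps the running total score to an estimated probability of the positive class. The stopping rule used here (stop once either class reaches a chosen confidence level) and the `>=` comparison against the threshold are illustrative assumptions, as are all numbers in the example.

```python
from typing import Dict, List, Sequence, Tuple


def scoring_system_decision(x: Sequence[int], scores: Sequence[int], threshold: int) -> bool:
    """Classical scoring system: sum the points of satisfied features, compare to a threshold."""
    total = sum(s for xi, s in zip(x, scores) if xi)
    return total >= threshold  # assumption: ties count as a positive decision


def psl_predict(
    x: Sequence[int],
    scores: Sequence[int],
    stage_probs: List[Dict[int, float]],
    confidence: float,
) -> Tuple[float, int]:
    """Evaluate features one by one and stop once the estimate is confident enough.

    stage_probs[k] maps the total score after k evaluated features to an
    estimated probability of the positive class (hypothetical values here).
    Returns the probability estimate and the number of features evaluated.
    """
    total = 0
    p = stage_probs[0][0]  # stage 0: no feature evaluated yet, total score is 0
    if max(p, 1.0 - p) >= confidence:
        return p, 0
    for k, (xi, s) in enumerate(zip(x, scores), start=1):
        if xi:
            total += s
        p = stage_probs[k][total]
        # Assumed stopping rule: either class is predicted with enough confidence.
        if max(p, 1.0 - p) >= confidence:
            return p, k
    return p, len(scores)


# Hypothetical scores and probability tables, for illustration only.
SCORES = [2, 1, 1]
STAGE_PROBS = [
    {0: 0.30},
    {0: 0.15, 2: 0.70},
    {0: 0.08, 1: 0.35, 2: 0.60, 3: 0.92},
    {0: 0.03, 1: 0.20, 2: 0.45, 3: 0.80, 4: 0.97},
]

if __name__ == "__main__":
    print(scoring_system_decision([1, 1, 0], SCORES, threshold=3))
    print(psl_predict([1, 1, 0], SCORES, STAGE_PROBS, confidence=0.9))
```

In this sketch, an instance whose first two features are satisfied stops after the second stage, so the third feature never has to be collected. That is the practical appeal of the decision-list style of evaluation.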
