Probabilistic Scoring Lists for Interpretable Machine Learning

Jonas Hanselle; Stefan Heid; Johannes Fürnkranz; Eyke Hüllermeier

TL;DR

This work introduces Probabilistic Scoring Lists (PSL), an uncertainty-aware extension of scoring systems that returns probability distributions rather than deterministic decisions and evaluates features sequentially, in the manner of a decision list. The authors develop a greedy learning algorithm, probability calibration (isotonic regression and beta calibration), in-search feature binarization, and a framework for capturing epistemic uncertainty via confidence intervals, along with a ranking variant using a pairwise SRL objective. They evaluate PSLs on medical and UCI datasets and report that expected entropy minimization drives effective feature selection, that calibration quality is competitive with logistic regression (LR) on several datasets, and that the approach supports risk-aware decision-making and ranking with interpretable, stage-wise progress. The work points to practical implications for cost-effective data collection and transparent, robust decision support in domains where safety and interpretability are paramount, and it outlines avenues for regularization, multi-class extension, and tighter uncertainty quantification.
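To make the idea of greedy learning by expected entropy minimization concrete, here is a minimal sketch. It is not the authors' implementation: the function names, the use of raw empirical frequencies instead of calibrated probabilities, and the tie-breaking rule are all assumptions made for illustration. The score set {1, 2, 3} is taken from the Figure 2 caption.

```python
import math
from collections import defaultdict


def binary_entropy(p):
    """Shannon entropy (in bits) of a Bernoulli distribution with parameter p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)


def expected_entropy(totals, y):
    """Entropy of the empirical P(y=1 | total score), weighted by score frequency."""
    count = defaultdict(int)
    positives = defaultdict(int)
    for t, label in zip(totals, y):
        count[t] += 1
        positives[t] += label
    n = len(y)
    return sum(count[t] / n * binary_entropy(positives[t] / count[t]) for t in count)


def greedy_psl(X, y, score_set=(1, 2, 3)):
    """Hypothetical greedy search: at each stage, add the (feature, score) pair
    that minimizes the expected entropy of the resulting total scores.
    X holds binary feature vectors; ties keep the first candidate found."""
    remaining = set(range(len(X[0])))
    totals = [0] * len(y)
    stages = []
    while remaining:
        best = None
        for f in sorted(remaining):
            for s in score_set:
                candidate = [t + s * row[f] for t, row in zip(totals, X)]
                h = expected_entropy(candidate, y)
                if best is None or h < best[0]:
                    best = (h, f, s, candidate)
        h, f, s, totals = best
        remaining.remove(f)
        stages.append((f, s, h))
    return stages


# Toy data: feature 0 separates the classes perfectly, feature 1 does not.
X = [[1, 0], [1, 1], [0, 1], [0, 0]]
y = [1, 1, 0, 0]
stages = greedy_psl(X, y)
```

Each entry of `stages` is a `(feature index, assigned score, expected entropy after this stage)` triple, so the list can be read as the stage-wise progress of the model.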

Abstract

A scoring system is a simple decision model that checks a set of features, adds a certain number of points to a total score for each feature that is satisfied, and finally makes a decision by comparing the total score to a threshold. Scoring systems have a long history of active use in safety-critical domains such as healthcare and justice, where they provide guidance for making objective and accurate decisions. Given their genuine interpretability, the idea of learning scoring systems from data is obviously appealing from the perspective of explainable AI. In this paper, we propose a practically motivated extension of scoring systems called probabilistic scoring lists (PSL), as well as a method for learning PSLs from data. Instead of making a deterministic decision, a PSL represents uncertainty in the form of probability distributions, or, more generally, probability intervals. Moreover, in the spirit of decision lists, a PSL evaluates features one by one and stops as soon as a decision can be made with enough confidence. To evaluate our approach, we conduct a case study in the medical domain.
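The two models described in the abstract can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not the paper's definition: the `>=` comparison against the threshold, the `prior` argument, the per-stage probability tables, and the symmetric `confidence` stopping rule are hypothetical choices, and all numbers below are made up.

```python
from typing import Dict, List, Sequence, Tuple


def scoring_system_decision(x: Sequence[int], scores: Sequence[int], threshold: int) -> int:
    """Deterministic scoring system: add the points of every satisfied
    (binary) feature and compare the total score to a threshold."""
    total = sum(s for xi, s in zip(x, scores) if xi == 1)
    return 1 if total >= threshold else 0


def psl_predict(
    x: Sequence[int],
    prior: float,
    stages: List[Tuple[int, int, Dict[int, float]]],
    confidence: float = 0.9,
) -> Tuple[int, float]:
    """PSL-style evaluation: check features one by one and stop as soon as
    the estimated probability of either class reaches `confidence`.
    Each stage is (feature index, score, table mapping total score -> P(y=1)).
    Returns the number of features evaluated and the final probability."""
    total = 0
    p = prior
    used = 0
    for feature_index, score, table in stages:
        if max(p, 1.0 - p) >= confidence:
            break  # confident enough, no further features needed
        total += score * x[feature_index]
        p = table[total]
        used += 1
    return used, p


# Made-up example with three binary features.
x = [1, 0, 1]
stages = [
    (0, 2, {0: 0.2, 2: 0.6}),
    (2, 1, {0: 0.1, 1: 0.3, 2: 0.5, 3: 0.95}),
    (1, 1, {0: 0.05, 1: 0.2, 2: 0.4, 3: 0.9, 4: 0.99}),
]
decision = scoring_system_decision(x, scores=[2, 1, 1], threshold=3)
used, probability = psl_predict(x, prior=0.3, stages=stages)
```

In this example the PSL stops after two of the three features, because the estimated probability of the positive class already exceeds the chosen confidence level; the third feature never has to be collected.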

Paper Structure (19 sections, 22 equations, 7 figures, 1 table)

Introduction
Related Work
Probabilistic Scoring Lists
Learning Probabilistic Scoring Lists
A Greedy Learning Algorithm
Probability Estimation
Feature Binarization
Beyond Probabilities: Capturing Epistemic Uncertainty
Ranking
Empirical Evaluation
Datasets
Coronary Heart Disease Data
UCI Datasets
RQ1: Expected Entropy Minimization
RQ2: Investigating Probability Estimates
...and 4 more sections

Figures (7)

Figure 1: Example of calibration with isotonic regression and beta calibration, using the medical dataset introduced in Section 5. The values on the x-axis correspond to the total scores. As class labels are either 0 or 1, the data points $\mathcal{C}$ are plotted with jittering for better visualization.
Figure 2: Evaluation of the greedy learning algorithm (blue line) on the coronary heart disease dataset. The light blue lines show the complete search space induced by all feature permutations and possible score assignments. The features selected by the greedy algorithm at each stage are labelled on the x-axis. The visualization was created for a score set $\mathcal{S} = \{1,2,3\}$.
Figure 4: Stagewise Brier Score for PSLs on the test datasets.
Figure 5: Probability estimates of all possible total scores for the first 4 stages and stage 7 of the PSL, trained on the full CHD dataset. The error bars show the 95% confidence interval described in the section "Beyond Probabilities: Capturing Epistemic Uncertainty".
Figure 6: Expected loss, calculated using the upper confidence bound of the 50% confidence interval.
...and 2 more figures

Theorems & Definitions (3)

Definition 1: Scoring system
Definition 2: Probabilistic scoring system, PSS
Definition 3: Probabilistic scoring list, PSL
