Understanding Prediction Discrepancies in Machine Learning Classifiers

Xavier Renard, Thibault Laugel, Marcin Detyniecki

TL;DR

The paper defines prediction discrepancies among equi-performing classifiers trained on the same data and introduces Discrepancy Interval Generation (DIG), a model-agnostic method that learns and explains local discrepancy regions as counterfactual intervals. DIG builds a graph of training points to precompute discrepancy borders and then retrieves the closest intervals for new instances, providing grounded, actionable explanations to inform model selection and risk management. Empirical results across multiple tabular datasets (and extensions to image data via DIG-CV) show widespread discrepancies, demonstrate DIG's superior accuracy and efficiency over baselines and adapted XAI methods, and illustrate practical use cases such as German Credit for local explanations and global discrepancy insights. The work highlights the practical impact of understanding where and why models with similar performance disagree, enabling better debugging, safer deployment, and more informed auditing of ML systems.

Abstract

A multitude of classifiers can be trained on the same data to achieve similar performance at test time while having learned significantly different classification patterns. This phenomenon, which we call prediction discrepancies, is often associated with the blind selection of one model over another with similar performance. When making a choice, the machine learning practitioner has no understanding of the differences between models, their limits, or where they agree and where they don't. Yet this choice has concrete consequences for instances classified in the discrepancy zone, since the final decision will be based on the selected classification pattern. Besides the arbitrary nature of the result, a bad choice could have further negative consequences such as loss of opportunity or lack of fairness. This paper addresses this issue by analyzing the prediction discrepancies in a pool of best-performing models trained on the same data. A model-agnostic algorithm, DIG, is proposed to capture and explain discrepancies locally, enabling the practitioner to make a well-informed decision when selecting a model by anticipating its potential undesired consequences. All the code to reproduce the experiments is available.

Paper Structure

This paper contains 46 sections, 4 equations, 17 figures, 5 tables, 2 algorithms.

Figures (17)

  • Figure 1: Illustration of prediction discrepancies of a pool of 10 models trained over the half-moons dataset (dots) using Autogluon. Each colored line represents the decision boundary of a classifier.
  • Figure 2: Proportion of prediction discrepancies among the 72 datasets of the OpenML-CC18 benchmark suite. Each dot is a dataset; the ordinate is the proportion of instances affected by prediction discrepancies among the best models submitted on OpenML by ML practitioners (in a $2\%$-comparable pool of classifiers).
  • Figure 3: Illustration of the principle of DIG on a toy dataset (red and blue points, colored depending on their true label) and a pool of 2 classifiers (yellow/green lines). Discrepancy regions to be detected by DIG are represented by hatched areas.
  • Figure 4: Proposed architecture for DIG-CV.
  • Figure 5: Sampling strategies of DIG (left) and KDE (right).
  • ...and 12 more figures

Theorems & Definitions (3)

  • Definition 1: equi-performing pool
  • Definition 2: $\epsilon$-comparable pool
  • Definition 3: Pool discrepancy
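
The formal statements of these definitions are not reproduced on this page, so the following is only a minimal sketch under stated assumptions. It assumes that an $\epsilon$-comparable pool keeps the models whose test accuracy lies within $\epsilon$ of the best model, and that an instance is affected by a pool discrepancy when at least two models of the pool predict different labels for it. The function names `comparable_pool` and `discrepancy_mask`, and the toy accuracies and predictions, are hypothetical and are not taken from the paper or its code.

```python
import numpy as np


def comparable_pool(accuracies, eps):
    """Return indices of models whose accuracy is within eps of the best.

    Assumed reading of an eps-comparable pool; the paper's formal
    definition may differ.
    """
    accuracies = np.asarray(accuracies, dtype=float)
    return np.flatnonzero(accuracies >= accuracies.max() - eps)


def discrepancy_mask(predictions):
    """Flag instances on which the pool disagrees.

    predictions: array of shape (n_models, n_instances) holding labels.
    Returns a boolean array of shape (n_instances,), True where at least
    one model differs from the first model (hence the pool is not unanimous).
    """
    predictions = np.asarray(predictions)
    return (predictions != predictions[0]).any(axis=0)


# Toy, made-up inputs: test accuracy of 4 models and their labels on 5 instances.
accuracies = [0.91, 0.90, 0.84, 0.895]
predictions = np.array([
    [0, 1, 1, 0, 1],
    [0, 1, 0, 0, 1],
    [1, 0, 0, 1, 1],
    [0, 1, 1, 0, 0],
])

pool = comparable_pool(accuracies, eps=0.02)   # the model at 0.84 is excluded
mask = discrepancy_mask(predictions[pool])     # disagreement inside the pool only
proportion = mask.mean()                       # share of instances affected

print("pool:", pool.tolist())
print("instances with discrepancy:", np.flatnonzero(mask).tolist())
print("proportion affected:", proportion)
```

The proportion computed at the end corresponds to the kind of quantity plotted on the ordinate of Figure 2, here evaluated on toy data rather than on an OpenML dataset.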