Evaluating Explainability in Machine Learning Predictions through Explainer-Agnostic Metrics

Cristian Munoz; Kleyton da Costa; Bernardo Modenesi; Adriano Koshiyama

TL;DR

The paper develops six distinct model-agnostic metrics that quantify the extent to which model predictions can be explained, enabling a comprehensive evaluation of how models generate their outputs.

Abstract

The rapid integration of artificial intelligence (AI) into various industries has introduced new challenges in governance and regulation, particularly regarding the understanding of complex AI systems. A critical demand from decision-makers is the ability to explain the results of machine learning models, which is essential for fostering trust and ensuring ethical AI practices. In this paper, we develop six distinct model-agnostic metrics designed to quantify the extent to which model predictions can be explained. These metrics measure different aspects of model explainability, spanning local importance, global importance, and surrogate predictions, allowing for a comprehensive evaluation of how models generate their outputs. Furthermore, by computing our metrics, we can rank models in terms of explainability criteria such as importance concentration and consistency, prediction fluctuation, and surrogate fidelity and stability, offering a valuable tool for selecting models based not only on accuracy but also on transparency. We demonstrate the practical utility of these metrics on classification and regression tasks, and integrate these metrics into an existing Python package for public use.
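To make the idea of "importance concentration" concrete, the following is a minimal sketch of two concentration-style scores computed from a global feature-importance vector. It is not the paper's definition of any metric: the function names `alpha_feature_fraction` and `normalized_entropy`, the default `alpha=0.8`, and the use of Shannon entropy are all assumptions chosen for illustration, and the importance values are made up.

```python
import math


def normalize(importances):
    """Scale non-negative importances so they sum to 1."""
    total = sum(importances)
    if total <= 0:
        raise ValueError("importances must have a positive sum")
    return [value / total for value in importances]


def alpha_feature_fraction(importances, alpha=0.8):
    """Smallest fraction of features whose importances sum to at least alpha.

    Hypothetical helper (an assumption, not the paper's formula): a low value
    means a few features carry most of the importance.
    """
    weights = sorted(normalize(importances), reverse=True)
    cumulative = 0.0
    for count, weight in enumerate(weights, start=1):
        cumulative += weight
        # Small tolerance guards against floating-point round-off.
        if cumulative >= alpha - 1e-12:
            return count / len(weights)
    return 1.0


def normalized_entropy(importances):
    """Shannon entropy of the importance distribution divided by log(n).

    Hypothetical helper (an assumption): 1.0 means importance is spread
    uniformly, 0.0 means a single feature holds all of it.
    """
    weights = normalize(importances)
    n = len(weights)
    if n < 2:
        return 0.0
    entropy = -sum(w * math.log(w) for w in weights if w > 0)
    return entropy / math.log(n)


# Illustrative (made-up) importance vectors for two models.
concentrated = [0.7, 0.2, 0.05, 0.05]
uniform = [1.0, 1.0, 1.0, 1.0]

print(alpha_feature_fraction(concentrated), normalized_entropy(concentrated))
print(alpha_feature_fraction(uniform), normalized_entropy(uniform))
```

Under these assumptions, a model whose importance is concentrated in a few features scores lower on both quantities than one whose importance is spread evenly, which is the kind of ordering a concentration criterion is meant to expose.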

Paper Structure (29 sections, 17 equations, 4 figures, 2 tables)

Introduction
Background and Literature
Proposed explainer-agnostic metrics
Metrics Based on Global Feature Importance
Feature Importance Spread
$\alpha$-feature importance
Fluctuation Ratio
Rank Alignment
Metrics Based on Local Feature Importance
Rank Consistency
Importance Stability
Metrics Based on Surrogate Models
Performance Degradation
Surrogate Fidelity
Surrogate Feature Stability
...and 14 more sections

Figures (4)

Figure 1: A simplified representation of the explainer-agnostic metrics (EAMEX) framework.
Figure 2: Comparison of different feature importance analyses on the Adult Dataset for an ML model.
Figure 3: Comparison of different feature importance analyses on the US-Crime Dataset for an ML model.
Figure 4: Overall analysis of explainer-agnostic metrics for binary classification (a) and regression (b) tasks. The colored areas represent global importance, local importance, and surrogate importance metrics. All metrics' reference values were standardized to facilitate interpretation, so a value of 1 is the reference or desired value for every metric.

Theorems & Definitions (9)

Definition 3.1: Feature Importance Divergence
Definition 3.2: $\alpha$-Feature Importance
Definition 3.3: Fluctuation Ratio
Definition 3.4: Rank Alignment
Definition 3.5: Rank Consistency
Definition 3.6: Importance Stability
Definition 3.7: Performance Degradation
Definition 3.8: Surrogate Fidelity
Definition 3.9: Surrogate Feature Stability
