Explainable AI needs formalization

Stefan Haufe; Rick Wilming; Benedict Clark; Rustam Zhumagambetov; Ahcène Boubekki; Jörg Martin; Danny Panknin

TL;DR

Researchers should formally define the problems they intend to solve and design methods accordingly, which will lead to diverse use-case-dependent notions of explanation correctness and objective metrics of explanation performance that can be used to validate XAI algorithms.

Abstract

The field of "explainable artificial intelligence" (XAI) seemingly addresses the desire that decisions of machine learning systems should be human-understandable. However, in its current state, XAI itself needs scrutiny. Popular methods cannot reliably answer relevant questions about ML models, their training data, or test inputs, because they systematically attribute importance to input features that are independent of the prediction target. This limits the utility of XAI for diagnosing and correcting data and models, for scientific discovery, and for identifying intervention targets. The fundamental reason for this is that current XAI methods do not address well-defined problems and are not evaluated against targeted criteria of explanation correctness. Researchers should formally define the problems they intend to solve and design methods accordingly. This will lead to diverse use-case-dependent notions of explanation correctness and objective metrics of explanation performance that can be used to validate XAI algorithms.

Paper Structure (33 sections, 1 figure, 1 table)

Introduction
Desired purposes of XAI
Current XAI does not serve desired purposes
Structural limitations of current XAI research
Towards using XAI for well-defined purposes
Discussion and Outlook
Acknowledgments
Author Contributions
Competing Interests

Figures (1)

Figure 1: a/b) Data sampled from the generative model (Example A) introduced in Section \ref{sec:examples} \cite{wilming2023theoretical} for two different correlations $c$ and constant variances $s_1^2 = 0.8$ and $s_2^2 = 0.5$. Boundaries of the Bayes-optimal decisions are also shown. The marginal sample distributions illustrate that feature $X_2$ does not carry any class-related information. c) Causal structure of the data in Examples A (left) and B (right). $X_2$ is a so-called suppressor variable: it has no statistical association with the target $Y$, although both influence feature $X_1$, which is called a collider. Figure partially adapted from Wilming et al. \cite{wilming2023theoretical}.
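
The suppressor effect described in the caption can be reproduced numerically. The sketch below is illustrative only: the exact generative model of Example A is not given in this excerpt, so it assumes a two-class Gaussian model in which the class label shifts only $X_1$ and the noise on $X_1$ and $X_2$ has correlation $c$. The function names, the sample size, and the value $c = 0.8$ are hypothetical choices; only the variances $s_1^2 = 0.8$ and $s_2^2 = 0.5$ come from the caption.

```python
import numpy as np


def sample_suppressor_data(n, c, s1_sq=0.8, s2_sq=0.5, seed=0):
    """Draw n samples from an ASSUMED two-feature suppressor model.

    The label y in {-1, +1} shifts only feature X_1; X_2 is pure noise
    that is correlated (correlation c) with the noise on X_1.
    """
    rng = np.random.default_rng(seed)
    y = rng.choice([-1.0, 1.0], size=n)
    s1, s2 = np.sqrt(s1_sq), np.sqrt(s2_sq)
    cov = np.array([[s1_sq, c * s1 * s2],
                    [c * s1 * s2, s2_sq]])
    noise = rng.multivariate_normal(np.zeros(2), cov, size=n)
    x = noise.copy()
    x[:, 0] += y  # only X_1 carries the class signal
    return x, y, cov


def bayes_optimal_weights(cov):
    """Linear Bayes-optimal weights for equal priors and shared covariance.

    The class means are (+1, 0) and (-1, 0), so their difference is (2, 0)
    and the weight vector is cov^{-1} @ (2, 0).
    """
    return np.linalg.solve(cov, np.array([2.0, 0.0]))


x, y, cov = sample_suppressor_data(n=20000, c=0.8)  # c = 0.8 is a hypothetical value
corr_x2_y = np.corrcoef(x[:, 1], y)[0, 1]
w = bayes_optimal_weights(cov)

print("empirical correlation of X_2 with Y:", corr_x2_y)
print("Bayes-optimal weights (w_1, w_2):", w)
```

Under these assumptions, the empirical correlation between $X_2$ and $Y$ is close to zero, yet the optimal classifier assigns $X_2$ a clearly nonzero weight because it uses $X_2$ to cancel noise in $X_1$. An attribution method that reads importance off the model's weights would therefore flag a feature that carries no class-related information on its own.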

Theorems & Definitions (1)

Definition 1: Statistical Association Property, SAP
