What Does Evaluation of Explainable Artificial Intelligence Actually Tell Us? A Case for Compositional and Contextual Validation of XAI Building Blocks

Kacper Sokol; Julia E. Vogt

TL;DR

A comprehensive sociotechnical utility-based evaluation framework could allow us to reason systematically about the properties and downstream influence of the different building blocks from which explainable artificial intelligence systems are composed – accounting for a diverse range of their engineering and social aspects – in view of the anticipated use case.

Abstract

Despite significant progress, evaluation of explainable artificial intelligence remains elusive and challenging. In this paper, we propose a fine-grained validation framework that is not overly reliant on any one facet of these sociotechnical systems, and that recognises their inherent modular structure: technical building blocks, user-facing explanatory artefacts and social communication protocols. While we concur that user studies are invaluable in assessing the quality and effectiveness of explanation presentation and delivery strategies from the explainees' perspective in a particular deployment context, the underlying explanation generation mechanisms require a separate, predominantly algorithmic validation strategy that accounts for the technical and human-centred desiderata of their (numerical) outputs. Such a comprehensive sociotechnical utility-based evaluation framework could allow us to reason systematically about the properties and downstream influence of the different building blocks from which explainable artificial intelligence systems are composed – accounting for a diverse range of their engineering and social aspects – in view of the anticipated use case.

Paper Structure (13 sections, 2 figures)

Evaluation Purposes
Evaluation Approaches
Evaluation Frameworks
Algorithmic Evaluation
User-centred Evaluation
Evaluation Deficiencies
Inconsistent Evaluation Findings
Neglected Explainability Context
Oversimplified Human Explanatory Process
Misconstrued Explainer Structure
Overlooked Model Quality
Sociotechnical Utility-based Evaluation Framework
Conclusion and Future Work

Figures (2)

Figure 1: Depiction of our (rudimentary) sociotechnical utility-based evaluation framework illustrating functionally independent building blocks of XAI systems. This example uses linear (LIME; Ribeiro et al., 2016) and tree-based (LIMEtree; Sokol and Flach, 2020) surrogate explainers.
Figure 2: High-level view of our utility-based evaluation framework that separates technical correctness of explanatory insights from social effectiveness of their delivery.
