Evaluating the Influences of Explanation Style on Human-AI Reliance

Emma Casolin, Flora D. Salim, Ben Newell

TL;DR

This study investigates the influences of feature-based, example-based, and combined feature- and example-based XAI methods on human-AI reliance through a two-part experimental study with 274 participants comparing these explanation style conditions.

Abstract

Explainable AI (XAI) aims to support appropriate human-AI reliance by increasing the interpretability of complex model decisions. Despite the proliferation of proposed methods, there is mixed evidence surrounding the effects of different styles of XAI explanations on human-AI reliance. Interpreting these conflicting findings requires an understanding of the individual and combined qualities of different explanation styles that influence appropriate and inappropriate human-AI reliance, and the role of interpretability in this interaction. In this study, we investigate the influences of feature-based, example-based, and combined feature- and example-based XAI methods on human-AI reliance through a two-part experimental study with 274 participants comparing these explanation style conditions. Our findings suggest differences between feature-based and example-based explanation styles beyond interpretability that affect human-AI reliance patterns across differences in individual performance and task complexity. Our work highlights the importance of adapting explanations to their specific users and context over maximising broad interpretability.

Paper Structure

This paper contains 27 sections, 5 equations, 7 figures, 3 tables.

Figures (7)

  • Figure 1: The three stages of the Judge-Advisor System (JAS) Framework.
  • Figure 2: Experimental design of our study. Participants are randomly allocated to one of four explanation-style groups after reading and agreeing to the consent form, and remain assigned to this condition for both Part 1 and Part 2. Before beginning each Part, participants are given information about the task, followed by a short quiz to confirm their understanding.
  • Figure 3: Equivalence of information as determined through combined explanations between an image (left) and its two nearest-neighbour example-based explanations (middle and right). Ovals of the same colour indicate features shared across the three images.
  • Figure 4: Participants' average initial and final accuracy per experimental condition, relative to the accuracy of the AI.
  • Figure 5: Participants' average initial and final accuracy per experimental condition split by task ability, relative to the accuracy of the AI. Error bars show standard error of the mean.
  • ...and 2 more figures