Comprehension Is a Double-Edged Sword: Over-Interpreting Unspecified Information in Intelligible Machine Learning Explanations

Yueqing Xuan, Edward Small, Kacper Sokol, Danula Hettiachchi, Mark Sanderson

TL;DR

Highly comprehensible explanations, e.g., feature importance and decision surface visualisation, are exceptionally susceptible to misinterpretation because users tend to infer spurious information that lies outside the scope of these explanations.

Abstract

Automated decision-making systems are becoming increasingly ubiquitous, which creates an immediate need for their interpretability and explainability. However, it remains unclear whether users know what insights an explanation offers and, more importantly, what information it lacks. To answer this question, we conducted an online study with 200 participants, which allowed us to assess explainees' ability to realise explicated information -- i.e., factual insights conveyed by an explanation -- and unspecified information -- i.e., insights that are not communicated by an explanation -- across four representative explanation types: model architecture, decision surface visualisation, counterfactual explainability and feature importance. Our findings reveal that highly comprehensible explanations, e.g., feature importance and decision surface visualisation, are exceptionally susceptible to misinterpretation since users tend to infer spurious information that is outside of the scope of these explanations. Additionally, while the users gauge their confidence accurately with respect to the information explicated by these explanations, they tend to be overconfident when misinterpreting the explanations. Our work demonstrates that human comprehension can be a double-edged sword since highly accessible explanations may convince users of their truthfulness while possibly leading to various misinterpretations at the same time. Machine learning explanations should therefore carefully navigate the complex relation between their full scope and limitations to maximise understanding and curb misinterpretation.

Paper Structure (35 sections, 13 figures, 11 tables)

Figures (13)

  • Figure 1: Overview of the user study workflow employed to assess user comprehension and perception of different explanations. In the actual survey, Step 1 is shown in a separate screen; once a participant clicks Next, Steps 2--4 are shown in sequence in a single new screen. Clicking Next again leads to an updated screen showing Steps 2--4 for a new explanation. For each explanation, the participant can hover their mouse over the Descriptions keyword (visible in Step 2), which triggers a drop-down box showing information from Step 1. Therefore, the participant can always revisit model information in case they forget the details.
  • Figure 2: Accuracy of the participants' answers to the questions about explicated and unspecified information stratified by explanation type is shown in Panel (a). Participants were significantly more likely to understand explicated information shown in feature importance and decision surface visualisation but less attuned to their unspecified information, in contrast to counterfactuals and model architecture. The average user comprehension score for the four questions about explicated information and the four questions about unspecified information grouped by explanation type is shown in Panel (b) for logistic regression and Panel (c) for decision tree. Participants were significantly more likely to have correct comprehension of the information unspecified by the explanations of logistic regression compared to decision tree. Average score for all comprehension questions (including all eight questions about explicated and unspecified information) stratified by the ML model type is shown in Panel (d). All error bars indicate 95% confidence intervals.
  • Figure 3: Accuracy of our participants' answers to comprehension questions about explicated and unspecified information grouped by explanation type, separately for our two ML models. Error bars indicate 95% confidence intervals. The participants' comprehension of explicated information is substantially more accurate than their comprehension of unspecified information for every explanation type of logistic regression and for three of the four explanation types of decision tree (all except counterfactual explainability).
  • Figure 4: Overview of our participants' performance for questions about explicated and unspecified information and their confidence level. Panel (a) suggests that the participants with different levels of performance on the questions about explicated information attained comparable average scores for the questions about unspecified information. Panel (b) shows participants with different levels of performance for questions about explicated information and the confidence in their answers. Panel (c) shows participants with different levels of performance for questions about unspecified information and the confidence in their answers. Error bars indicate 95% confidence intervals.
  • Figure 5: Overview of our participants' confidence in their answers to comprehension questions about explicated and unspecified information is shown in Panel (a). Answer confidence of participants who identified explicated information correctly (displayed in green) and incorrectly (displayed in red) is shown in Panel (b). Answer confidence of participants who identified unspecified information correctly (displayed in green) and incorrectly (displayed in red) is shown in Panel (c). Participants who answered the questions about explicated information correctly reported significantly higher confidence in their answers than their peers who did not. On the other hand, participants who answered the questions about unspecified information correctly were less confident than those who answered them incorrectly. Error bars indicate 95% confidence intervals.
  • ...and 8 more figures
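
The figure captions above report comprehension accuracy grouped by explanation type and information kind, with 95% confidence intervals as error bars. The following is a minimal sketch of how such a summary could be computed. It assumes a normal-approximation interval (mean ± 1.96 standard errors); the listing does not state which interval method the authors used. The function names (`mean_with_ci95`, `summarise`) and the toy responses are hypothetical and are not taken from the study.

```python
import math
from statistics import mean, stdev


def mean_with_ci95(scores):
    """Return (mean, half_width) for a normal-approximation 95% CI.

    Assumption: the interval is mean +/- 1.96 * standard error.
    """
    n = len(scores)
    m = mean(scores)
    if n < 2:
        # A single observation gives no spread estimate.
        return m, float("nan")
    half_width = 1.96 * stdev(scores) / math.sqrt(n)
    return m, half_width


def summarise(responses):
    """Group 0/1 correctness by (explanation type, information kind)."""
    groups = {}
    for explanation, kind, correct in responses:
        groups.setdefault((explanation, kind), []).append(correct)
    return {key: mean_with_ci95(vals) for key, vals in groups.items()}


# Hypothetical toy responses for illustration only, NOT data from the study.
toy_responses = [
    ("feature importance", "explicated", 1),
    ("feature importance", "explicated", 1),
    ("feature importance", "explicated", 0),
    ("feature importance", "explicated", 1),
    ("feature importance", "unspecified", 0),
    ("feature importance", "unspecified", 1),
    ("feature importance", "unspecified", 0),
    ("feature importance", "unspecified", 0),
]

summary = summarise(toy_responses)
for (explanation, kind), (m, hw) in sorted(summary.items()):
    print(f"{explanation:20s} {kind:12s} accuracy={m:.2f} +/- {hw:.2f}")
```

With binary correctness scores and small groups, a Wilson or bootstrap interval may be preferable to the normal approximation; the sketch uses the simplest option to keep the grouping logic visible.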