Evaluation of Human-Understandability of Global Model Explanations using Decision Tree

Adarsa Sivaprasad; Ehud Reiter; Nava Tintarev; Nir Oren

TL;DR

This paper investigates end-user understandability of global model explanations versus local explanations in coronary heart disease risk prediction using narrative, patient-specific decision-tree explanations. It employs a GOSDT-learned depth-4/5 decision tree framework to generate local/easy, local/hard, global/easy, and global/hard explanations, comparing them to SHAP baselines in a within-subject study with $50$ participants and five scenarios, measuring $CR$, $UR$, $VR$, $D_m$, and $D_c$. The findings reveal no universal superiority of global explanations; rather, individual user groups show distinct preferences (local for some, global for others) and harder explanations increase errors, with SHAP explanations offering comparable understandability to certain local explanations but not consistently improving comprehension. The study yields actionable insights for health informatics design, emphasizing personalized narration, cognitive-load considerations, and the need for broader validation and extension to regression tasks and robust probability/confidence communication. The results highlight the potential of narrative global explanations to enhance trust and actionability in healthcare AI, while also outlining key limitations and directions for future work, including larger datasets, representative patient samples, and improved explanation-generation techniques with principled handling of semifactuals and uncertainty.

Abstract

In explainable artificial intelligence (XAI) research, the predominant focus has been on interpreting models for experts and practitioners. Model-agnostic and local explanation approaches are deemed interpretable and sufficient in many applications. However, in domains like healthcare, where end users are patients without AI or domain expertise, there is an urgent need for model explanations that are more comprehensible and instil trust in the model's operations. We hypothesise that generating model explanations that are narrative, patient-specific and global (holistic of the model) would improve understandability and support decision-making. We test this using a decision tree model to generate both local and global explanations for patients identified as having a high risk of coronary heart disease. These explanations are presented to non-expert users. We find a strong individual preference for a specific type of explanation. The majority of participants prefer global explanations, while a smaller group prefers local explanations. A task-based evaluation of these participants' mental models provides valuable feedback for enhancing narrative global explanations. This, in turn, guides the design of health informatics systems that are both trustworthy and actionable.
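To make the local versus global distinction concrete, the following is a minimal sketch of how narrative explanations could be read off a decision tree. It is not the paper's generation procedure: the tree, the feature names (`age`, `cigarettes_per_day`), the thresholds, and the sentence templates are all hypothetical and chosen only for illustration. A local explanation narrates the single path a patient follows, while a global explanation narrates every root-to-leaf rule.

```python
# Toy decision tree; features, thresholds, and labels are invented for
# illustration and are NOT taken from the paper's learned tree.
TREE = {
    "feature": "age", "threshold": 50,
    "left": {"label": "low risk"},                      # age <= 50
    "right": {                                          # age > 50
        "feature": "cigarettes_per_day", "threshold": 10,
        "left": {"label": "low risk"},
        "right": {"label": "high risk"},
    },
}


def local_explanation(tree, patient):
    """Narrate only the decision path that this patient follows."""
    conditions = []
    node = tree
    while "label" not in node:
        name, thr = node["feature"], node["threshold"]
        value = patient[name]
        if value <= thr:
            conditions.append(f"{name} is {value} (at most {thr})")
            node = node["left"]
        else:
            conditions.append(f"{name} is {value} (above {thr})")
            node = node["right"]
    return f"Predicted {node['label']} because " + " and ".join(conditions) + "."


def global_explanation(tree):
    """Narrate every root-to-leaf rule, i.e. the whole model."""
    rules = []

    def walk(node, conditions):
        if "label" in node:
            rules.append(
                "If " + " and ".join(conditions)
                + f", the model predicts {node['label']}."
            )
            return
        name, thr = node["feature"], node["threshold"]
        walk(node["left"], conditions + [f"{name} is at most {thr}"])
        walk(node["right"], conditions + [f"{name} is above {thr}"])

    walk(tree, [])
    return rules


patient = {"age": 62, "cigarettes_per_day": 20}  # hypothetical patient
print(local_explanation(TREE, patient))
for rule in global_explanation(TREE):
    print(rule)
```

Even in this toy case the global narration grows with the number of leaves, which hints at why the amount of text shown to a reader is a design choice that can affect understandability.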

Paper Structure (15 sections, 2 equations, 9 figures, 12 tables)

Introduction
Related Work
Experiment Design
Dataset and Modeling
Generation of Explanation
Evaluation
Results and Discussion
Local vs. Global
Tree Explanation vs. SHAP
Easy vs. Hard
Limitations and Future Work
Construction and Selection of DT
Generating Explanation Narration
User Survey on Prolific
Comparison of Local and Global Explanation Ratings

Figures (9)

Figure 1: A comparison of Local SHAP, Local and Global tree explanation of CHD risk prediction using decision tree model. Different evaluation parameters are computed based on end-user feedback of the explanation.
Figure 2: An example of local and global narrative explanation of a DT. Note that this is one way of generating a global tree explanation (see the appendix "Generating Explanation Narration"). Listing all the nodes or stating all possible categorical values of features are design choices that will affect understandability.
Figure 3: Average rating for each explanation type across the participant groups.
Figure 4: Depth-5 decision tree generated on 2134 data points. Training accuracy = 90.9%, test accuracy on 534 records = 85%.
Figure 5: DTs for different scenarios. (a) Local easy scenario: decision tree generated on 116 data points, training accuracy = 78.4%. (b) Local hard scenario: decision tree generated on 163 data points, training accuracy = 77.3%. (c) Global easy scenario: decision tree generated on 382 data points, training accuracy = 82.5%. (d) Global hard scenario: decision tree generated on 108 data points, training accuracy = 85.4%.
...and 4 more figures
