A Survey on Human-Centered Evaluation of Explainable AI Methods in Clinical Decision Support Systems

Alessandro Gambetti, Qiwei Han, Hong Shen, Claudia Soares

TL;DR

The paper addresses the limited systematic evidence on how explainable AI (XAI) explanations influence real-world adoption of clinical decision support systems (CDSS). It conducts a PRISMA-guided survey of 31 human-centered evaluations, revealing a dominance of post-hoc, model-agnostic explanations (e.g., SHAP, Grad-CAM) tested in small clinician samples, with explanations often increasing cognitive load and misaligning with domain reasoning. It contributes a socio-technical, stakeholder-centric evaluation framework and an iterative development protocol to align XAI with diverse stakeholders, aiming to produce trustworthy, clinically viable CDSS. The work has practical significance for guiding future CDSS design, evaluation, and deployment in healthcare, emphasizing augmentation of clinician expertise rather than replacement and highlighting areas for regulatory and workflow considerations.

Abstract

Explainable Artificial Intelligence (XAI) is essential for the transparency and clinical adoption of Clinical Decision Support Systems (CDSS). However, the real-world effectiveness of existing XAI methods remains limited and is inconsistently evaluated. This study conducts a systematic PRISMA-guided survey of 31 human-centered evaluations (HCE) of XAI applied to CDSS, classifying them by XAI methodology, evaluation design, and adoption barrier. Our findings reveal that over 80% of the studies adopt post-hoc, model-agnostic approaches such as SHAP and Grad-CAM, assessed through small-scale clinician studies in which sample sizes remain below 25 participants. Explanations generally improve clinician trust and diagnostic confidence, but they frequently increase cognitive load and are misaligned with domain reasoning processes. To bridge these gaps, we propose a stakeholder-centric evaluation framework that integrates socio-technical principles and human-computer interaction to guide the future development of clinically viable and trustworthy XAI-based CDSS.

Paper Structure

This paper contains 19 sections, 4 figures, 2 tables.

Figures (4)

  • Figure 1: The survey focuses exclusively on works at the intersection of Healthcare, Explainable AI, and Human-Centered Evaluations (the yellow region).
  • Figure 2: PRISMA Flow Diagram detailing the systematic process of identifying, screening, and selecting studies for inclusion.
  • Figure 3: Histogram of retrieved papers per year.
  • Figure 4: Distribution of top XAI and HCE methodologies and top medical fields where CDSS were used.