OpenHEXAI: An Open-Source Framework for Human-Centered Evaluation of Explainable Machine Learning

Jiaqi Ma; Vivian Lai; Yiming Zhang; Chacha Chen; Paul Hamilton; Davor Ljubenkov; Himabindu Lakkaraju; Chenhao Tan

TL;DR

OpenHEXAI is the first large-scale infrastructural effort to facilitate human-centered benchmarks of XAI methods. It simplifies the design and implementation of user studies for XAI methods, allowing researchers and practitioners to focus on the scientific questions.

Abstract

Recently, there has been a surge of explainable AI (XAI) methods driven by the need to understand machine learning model behavior in high-stakes scenarios. However, properly evaluating the effectiveness of XAI methods inevitably requires the involvement of human subjects, and conducting human-centered benchmarks is challenging in a number of ways: designing and implementing user studies is complex; the many design choices in the design space of user studies lead to reproducibility problems; and running user studies can be challenging and even daunting for machine learning researchers. To address these challenges, this paper presents OpenHEXAI, an open-source framework for human-centered evaluation of XAI methods. OpenHEXAI features (1) a collection of diverse benchmark datasets, pre-trained models, and post hoc explanation methods; (2) an easy-to-use web application for user studies; (3) comprehensive evaluation metrics for the effectiveness of post hoc explanation methods in the context of human-AI decision-making tasks; (4) best-practice recommendations for experiment documentation; and (5) convenient tools for power analysis and cost estimation. OpenHEXAI is the first large-scale infrastructural effort to facilitate human-centered benchmarks of XAI methods. It simplifies the design and implementation of user studies for XAI methods, allowing researchers and practitioners to focus on the scientific questions. Additionally, it enhances reproducibility through standardized designs. Based on OpenHEXAI, we further conduct a systematic benchmark of four state-of-the-art post hoc explanation methods and compare their impact on human-AI decision-making tasks in terms of accuracy, fairness, and users' trust in and understanding of the machine learning model.

Paper Structure (36 sections, 3 figures, 8 tables)

Introduction
Literature Survey
Task Scenarios.
Post hoc Explanation Methods.
Research Questions.
The OpenHEXAI Framework
Benchmark Scenario
Machine Learning Module
Web Application Module
User Study Task Flow.
Adaptable and Reusable Design.
Evaluation Module
Evaluation Card
The design of the evaluation card.
Power Analysis and Cost Estimate
...and 21 more sections

Figures (3)

Figure 1: This figure illustrates the task page for the RCDV dataset and conditions with the predicted label and explanations. (1) shows a box including explanations for features that require more explanations. (2) shows the profile of a defendant. (3) shows the predicted label. (4) is a description of how the bar chart could be interpreted. Finally, (5) shows the bar chart that orders features by their absolute feature importance scores.
Figure 2: This figure illustrates the task page for the control condition with data features only (F) on the German Credit dataset. There are two main components on this page: the feature explanations box and the profile table. The feature explanations box gives more information on features that might be difficult to understand from a short description. The profile table shows the information for the profile on which the user is required to make a prediction.
Figure 3: This figure illustrates the task page for the control condition with data features and model prediction (FP) on the RCDV dataset. In addition to the feature explanations box and the profile table shown in Figure 2, there is an AI prediction on top of the profile table.
