On the Definition of Appropriate Trust and the Tools that Come with it

Helena Löfström

TL;DR

The paper tackles the challenge of evaluating explanation quality in XAI by reframing ‘appropriate trust’ as a performance-like metric for human users. It draws parallels between user trust and model performance and introduces a framework that uses confusion-matrix-based metrics (precision, recall, F1) to quantify misuse, disuse, and overall appropriate trust (U_at). The paper extends the approach to regression via conformal-regression-style uncertainty and discusses incorporating user uncertainty through probability-estimate intervals (e.g., Venn-Abers). It illustrates the method with a real-world data example and argues that an objective, comparable metric for user evaluations enables comparative studies of explanation methods and their impact on trust calibration, with practical implications for responsible AI deployment.

Abstract

Evaluating the efficiency of human-AI interactions is challenging because it involves both subjective and objective quality aspects. With the focus on the human experience of explanations, evaluations of explanation methods have become mostly subjective, which ties them closely to the individual user and makes comparative evaluations almost impossible. However, it is commonly agreed that one aspect of explanation quality is how effectively the user can detect whether the predictions are trustworthy and correct, i.e., whether the explanations can increase the user's appropriate trust in the model. This paper starts with the definitions of appropriate trust from the literature and compares them with model performance evaluation, showing the strong similarities between the two. The paper's main contribution is a novel approach to evaluating appropriate trust that takes advantage of these similarities. The paper offers several straightforward evaluation methods for different aspects of user performance, including a suggested method for measuring uncertainty and appropriate trust in regression.
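
The parallel between appropriate trust and model performance evaluation can be illustrated with a minimal sketch. It is not the paper's implementation, and it does not reproduce the paper's U_at definition. It assumes that each user decision is recorded as a pair (was the model correct, did the user trust it), that trusting an incorrect prediction counts as misuse, and that distrusting a correct prediction counts as disuse. The names TrustCounts, count_trust, and trust_metrics are hypothetical, and reading precision as sensitivity to misuse and recall as sensitivity to disuse is an assumption made here for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Sequence


@dataclass
class TrustCounts:
    """Cells of an assumed 2x2 'trust confusion matrix' (hypothetical naming)."""
    appropriate_trust: int = 0     # user trusted, model was correct
    misuse: int = 0                # user trusted, model was wrong
    disuse: int = 0                # user distrusted, model was correct
    appropriate_distrust: int = 0  # user distrusted, model was wrong


def count_trust(model_correct: Sequence[bool],
                user_trusted: Sequence[bool]) -> TrustCounts:
    """Tally one cell per (model outcome, user decision) pair."""
    if len(model_correct) != len(user_trusted):
        raise ValueError("inputs must have the same length")
    counts = TrustCounts()
    for correct, trusted in zip(model_correct, user_trusted):
        if trusted and correct:
            counts.appropriate_trust += 1
        elif trusted and not correct:
            counts.misuse += 1
        elif not trusted and correct:
            counts.disuse += 1
        else:
            counts.appropriate_distrust += 1
    return counts


def _safe_div(numerator: float, denominator: float) -> float:
    """Return 0.0 instead of raising when a metric is undefined."""
    return numerator / denominator if denominator else 0.0


def trust_metrics(c: TrustCounts) -> Dict[str, float]:
    """Precision, recall, F1 and accuracy with 'trust' as the positive decision.

    Assumption: precision drops as misuse grows, recall drops as disuse grows.
    """
    precision = _safe_div(c.appropriate_trust, c.appropriate_trust + c.misuse)
    recall = _safe_div(c.appropriate_trust, c.appropriate_trust + c.disuse)
    f1 = _safe_div(2 * precision * recall, precision + recall)
    total = c.appropriate_trust + c.misuse + c.disuse + c.appropriate_distrust
    accuracy = _safe_div(c.appropriate_trust + c.appropriate_distrust, total)
    return {"precision": precision, "recall": recall,
            "f1": f1, "accuracy": accuracy}


# Illustrative, made-up decisions for six predictions shown to one user.
example_correct = [True, True, True, False, False, True]
example_trusted = [True, True, False, True, False, True]
example_counts = count_trust(example_correct, example_trusted)
print(example_counts)
print(trust_metrics(example_counts))
```

Treating "trust" as the positive decision keeps the metrics directly comparable to the usual classifier metrics, which is the kind of objective, comparable user evaluation the abstract argues for.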

Paper Structure (22 sections, 9 equations, 3 figures, 2 tables)

This paper contains 22 sections, 9 equations, 3 figures, 2 tables.

Figures (3)

  • Figure 1: Confusion matrix for model performance in Classification.
  • Figure 2: Identifying appropriate trust, from yang20.
  • Figure 3: Confusion trust matrix for user evaluations, inspired by yang20.