On the Definition of Appropriate Trust and the Tools that Come with it
Helena Löfström
TL;DR
The paper tackles the challenge of evaluating explanation quality in XAI by reframing ‘appropriate trust’ as a performance-like metric for human users. It draws strong parallels between user trust and model performance, and introduces a rigorous framework that uses confusion-matrix–based metrics (precision, recall, F1) to quantify misuse, disuse, and overall appropriate trust (U_at). The authors extend the approach to regression via conformal-regression–style uncertainty and discuss incorporating user uncertainty through probability-estimate intervals (e.g., Venn-Abers). They illustrate the method with a real-world data example and argue that an objective, comparable metric for user evaluations enables robust comparative studies of explanation methods and their impact on trust calibration, with practical implications for responsible AI deployment.
Abstract
Evaluating the efficiency of human-AI interactions is challenging because it involves both subjective and objective quality aspects. With the focus on the human experience of explanations, evaluations of explanation methods have become mostly subjective, which ties the results closely to the individual user and makes comparative evaluations almost impossible. However, it is commonly agreed that one aspect of explanation quality is how effectively the user can detect whether the predictions are trustworthy and correct, i.e., whether the explanations increase the user's appropriate trust in the model. This paper starts from the definitions of appropriate trust in the literature and compares them with model performance evaluation, showing strong similarities between the two. The paper's main contribution is a novel approach to evaluating appropriate trust that takes advantage of these similarities. It offers several straightforward evaluation methods for different aspects of user performance, including a suggested method for measuring uncertainty and appropriate trust in regression.
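As a rough illustration of the confusion-matrix view summarized in the TL;DR, the sketch below scores a user's trust decisions against the correctness of the model's predictions. The mapping of cells is an assumption made for this example, not the paper's stated definition: trusting an incorrect prediction is counted as misuse, and distrusting a correct prediction is counted as disuse. The function name `trust_confusion`, the input lists, and the toy data are hypothetical, and the paper's own U_at metric is not reproduced here.

```python
def trust_confusion(model_correct, user_trusted):
    """Score user trust decisions like a binary classifier (illustrative only).

    model_correct[i] is True if the model's i-th prediction was correct.
    user_trusted[i] is True if the user chose to rely on that prediction.
    Assumed mapping (not taken from the paper):
      trusted & correct      -> appropriate trust    (true positive)
      trusted & incorrect    -> misuse               (false positive)
      distrusted & correct   -> disuse               (false negative)
      distrusted & incorrect -> appropriate distrust (true negative)
    """
    if len(model_correct) != len(user_trusted):
        raise ValueError("inputs must have the same length")

    tp = fp = fn = tn = 0
    for correct, trusted in zip(model_correct, user_trusted):
        if trusted and correct:
            tp += 1
        elif trusted and not correct:
            fp += 1
        elif not trusted and correct:
            fn += 1
        else:
            tn += 1

    # Precision: share of trusted predictions that were actually correct.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: share of correct predictions that the user actually trusted.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 written in count form to avoid dividing by a zero precision + recall.
    f1 = 2 * tp / (2 * tp + fp + fn) if (2 * tp + fp + fn) else 0.0

    return {
        "appropriate_trust": tp,
        "misuse": fp,
        "disuse": fn,
        "appropriate_distrust": tn,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Toy data, invented for this example: eight predictions shown to one user.
model_correct = [True, True, True, False, False, True, False, True]
user_trusted = [True, True, False, True, False, True, False, True]
scores = trust_confusion(model_correct, user_trusted)
```

In this reading, low precision signals misuse (over-reliance) and low recall signals disuse (under-reliance), so the two error types can be reported separately or combined into a single F1-style score for comparison across explanation methods.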
