Towards Balancing Preference and Performance through Adaptive Personalized Explainability

Andrew Silva, Pradyumna Tambwekar, Mariah Schrum, Matthew Gombolay

TL;DR

The paper investigates how to balance the usefulness of explanations with user preferences in human-robot interaction by studying explainability modalities in a simulated autonomous-vehicle domain and proposing an adaptive personalization framework. It compares language explanations, feature-importance maps, and decision trees across population preferences and task performance, finding language explanations generally preferred and more effective, while noting modality-specific tradeoffs. Building on this, the authors develop an adaptive personalization approach that combines user preferences and task performance via sampling distributions and a neural predictor to tune explanations in real time. The empirical results show that balanced personalization improves performance relative to non-adaptive or random explanations and is not worse than using the best-known modality, suggesting a practical pathway to deploy personalized xAI in real-world HRI. The work highlights the importance of considering both user preferences and task success, offering design principles and a concrete adaptive mechanism with broad applicability in explainable AI for human-robot collaboration.
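
The summary above does not spell out how preference and performance are combined, so the following is only a minimal sketch of one way such balancing could work, assuming a simple weighted mixture of two per-modality score tables. The names `balanced_distribution`, `sample_modality`, and the `weight` parameter are hypothetical and not taken from the paper, and the neural predictor mentioned above is not modeled here.

```python
import random

# The three modalities compared in the paper.
MODALITIES = ["language", "feature_map", "decision_tree"]


def normalize(scores):
    """Scale non-negative scores so they sum to 1 (uniform if all are zero)."""
    total = sum(scores.values())
    if total <= 0:
        return {name: 1.0 / len(scores) for name in scores}
    return {name: value / total for name, value in scores.items()}


def balanced_distribution(preference, performance, weight=0.5):
    """Mix preference and performance scores into one sampling distribution.

    `weight` is a hypothetical knob: 1.0 means preference only,
    0.0 means performance only. This is an assumption for illustration.
    """
    pref = normalize(preference)
    perf = normalize(performance)
    mixed = {m: weight * pref[m] + (1.0 - weight) * perf[m] for m in pref}
    return normalize(mixed)


def sample_modality(dist, rng=random):
    """Draw one modality according to the mixed distribution."""
    names = list(dist)
    weights = [dist[name] for name in names]
    return rng.choices(names, weights=weights, k=1)[0]


# Example: a user who prefers language but performs best with decision trees.
preference = {"language": 3.0, "feature_map": 1.0, "decision_tree": 0.0}
performance = {"language": 1.0, "feature_map": 1.0, "decision_tree": 2.0}
dist = balanced_distribution(preference, performance, weight=0.5)
choice = sample_modality(dist)
```

With `weight=0.5`, neither signal dominates: a modality the user likes but performs poorly with is still sampled, just less often than under pure preference maximization.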

Abstract

As robots and digital assistants are deployed in the real world, these agents must be able to communicate their decision-making criteria to build trust, improve human-robot teaming, and enable collaboration. While the field of explainable artificial intelligence (xAI) has made great strides to enable such communication, these advances often assume that one xAI approach is ideally suited to each problem (e.g., decision trees to explain how to triage patients in an emergency or feature-importance maps to explain radiology reports). This fails to recognize that users have diverse experiences or preferences for interaction modalities. In this work, we present two user-studies set in a simulated autonomous vehicle (AV) domain. We investigate (1) population-level preferences for xAI and (2) personalization strategies for providing robot explanations. We find significant differences between xAI modes (language explanations, feature-importance maps, and decision trees) in both preference (p < 0.01) and performance (p < 0.05). We also observe that a participant's preferences do not always align with their performance, motivating our development of an adaptive personalization strategy to balance the two. We show that this strategy yields significant performance gains (p < 0.05), and we conclude with a discussion of our findings and implications for xAI in human-robot interactions.

Paper Structure

This paper contains 39 sections, 2 equations, and 19 figures.

Figures (19)

  • Figure 1: Here we show an example interaction with a language explanation and a correct suggestion. Taking a wrong turn (e.g., going straight) will lead directly into a roadblock, forcing participants to return to this intersection and repeat the interaction.
  • Figure 2: We compare three xAI modalities in this work: feature-importance maps (top left), in which highlighted regions indicate possible directions and relevant elements of the image, such as green indicating the suggested direction; language explanations (bottom left), which are sentences justifying one direction over another; and decision trees (right), in which the highlighted path leads to the suggested direction. Red blocks mean "false" and blue blocks mean "true".
  • Figure 3: Visualized results from the population user study between decision trees, feature-importance maps, and language explanations. (a) Feature maps lead to significantly increased inappropriate compliance. (b) Both feature maps and language explanations lead to more consecutive mistakes (quantities are normalized by total number of mistakes). (c) Language is significantly preferred over decision trees and feature maps. (d) Decision trees are slower to parse (measured in seconds).
  • Figure 4: Comparisons relative to balanced-personalization. (Left) Percent of participant preferences for personalization modes, showing significant preference for balanced personalization over no personalization. (Right) Rates of inappropriate compliance, showing that balanced-personalization leads to significantly lower inappropriate compliance than preference-maximization or no personalization.
  • Figure 5: We conduct two user studies, beginning with a population study (left) in which all participants work with three xAI modalities, revealing significant differences across the population. We then use this data to build an adaptive personalization model that is deployed in a set of personalization studies to compare various personalization approaches.
  • ...and 14 more figures