
Explanations, Fairness, and Appropriate Reliance in Human-AI Decision-Making

Jakob Schoeffer, Maria De-Arteaga, Niklas Kuehl

TL;DR

Feature-based explanations do not enable humans to distinguish correct from incorrect AI recommendations; instead, they may affect reliance irrespective of whether the recommendations are correct.

Abstract

In this work, we study the effects of feature-based explanations on distributive fairness of AI-assisted decisions, specifically focusing on the task of predicting occupations from short textual bios. We also investigate how any effects are mediated by humans' fairness perceptions and their reliance on AI recommendations. Our findings show that explanations influence fairness perceptions, which, in turn, relate to humans' tendency to adhere to AI recommendations. However, we see that such explanations do not enable humans to discern correct and incorrect AI recommendations. Instead, we show that they may affect reliance irrespective of the correctness of AI recommendations. Depending on which features an explanation highlights, this can foster or hinder distributive fairness: when explanations highlight features that are task-irrelevant and evidently associated with the sensitive attribute, this prompts overrides that counter AI recommendations that align with gender stereotypes. Meanwhile, if explanations appear task-relevant, this induces reliance behavior that reinforces stereotype-aligned errors. These results imply that feature-based explanations are not a reliable mechanism to improve distributive fairness.
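To make the reliance terminology concrete, the following is a minimal sketch of how override behavior could be tallied from decision logs. It is not the authors' analysis code: the `Decision` record, the `override_summary` helper, the placeholder occupation labels, and the toy data are all hypothetical. It assumes a binary task, and it assumes that an override of an incorrect AI recommendation counts as corrective while an override of a correct one counts as detrimental.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Decision:
    """One AI-assisted decision (hypothetical record layout)."""
    ai_recommendation: str
    human_decision: str
    true_label: str


def override_summary(decisions):
    """Tally overrides and split them by whether the AI was correct.

    Assumed definitions (not taken from the paper's text):
    - override: the human decision differs from the AI recommendation
    - corrective override: the overridden recommendation was incorrect
    - detrimental override: the overridden recommendation was correct
    """
    n = len(decisions)
    if n == 0:
        raise ValueError("need at least one decision")
    overrides = [d for d in decisions if d.human_decision != d.ai_recommendation]
    corrective = sum(1 for d in overrides if d.ai_recommendation != d.true_label)
    detrimental = len(overrides) - corrective
    correct = sum(1 for d in decisions if d.human_decision == d.true_label)
    return {
        "override_rate": len(overrides) / n,
        "corrective_overrides": corrective,
        "detrimental_overrides": detrimental,
        "accuracy": correct / n,
    }


# Toy data with placeholder labels "a" and "b" (illustrative only).
toy = [
    Decision("a", "a", "a"),  # adheres to a correct recommendation
    Decision("a", "b", "b"),  # overrides an incorrect recommendation
    Decision("b", "a", "b"),  # overrides a correct recommendation
    Decision("b", "b", "a"),  # adheres to an incorrect recommendation
]
summary = override_summary(toy)
print(summary)
```

Under these assumed definitions, a participant who could tell correct from incorrect recommendations would show many corrective and few detrimental overrides. A shift in the overall override rate without a shift in that balance would correspond to reliance changing irrespective of correctness.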

Paper Structure (60 sections, 1 equation, 17 figures, 2 tables)

Figures (17)

  • Figure 2: Study participants are randomly assigned to one of three conditions. In each condition, they first complete the task of predicting occupations from 14 short bios, and complete a demographic survey. In the conditions with explanations (Task-relevant and Gendered), participants are also asked about their fairness perceptions after completing the task.
  • Figure 3: Accuracy is not higher when explanations are provided, compared to the baseline.
  • Figure 4: Overrides are highest in the gendered condition.
  • Figure 5: Explanations do not enable corrective vs. detrimental overrides.
  • Figure 6: Explanations (middle and right) do not increase accuracy over the baseline for either men's or women's bios.
  • ...and 12 more figures