The Impact of Explanations on Fairness in Human-AI Decision-Making: Protected vs Proxy Features

Navita Goyal, Connor Baumler, Tin Nguyen, Hal Daumé III

TL;DR

The paper examines how explanations and disclosures influence fairness in human-AI decision-making when bias arises directly from protected attributes or indirectly via proxy features. It uses a micro-lending task with biased logistic regression models and varies explanations and two disclosure types across six conditions, measuring fairness perception, demographic parity, and decision quality. Explanations aid detection of direct biases but can increase acceptance of biased decisions, while disclosures (especially about bias and proxy correlations) help participants recognize and mitigate indirect biases. The joint intervention does not consistently improve fairness, however, which indicates that explanations are not a universal solution. Practically, the work informs when to deploy explanations and disclosures to support fair human-AI collaboration and underscores the need for careful design to avoid over-reliance on biased AI systems.

Abstract

AI systems have been known to amplify biases in real-world data. Explanations may help human-AI teams address these biases for fairer decision-making. Typically, explanations focus on salient input features. If a model is biased against some protected group, explanations may include features that demonstrate this bias, but when biases are realized through proxy features, the relationship between this proxy feature and the protected one may be less clear to a human. In this work, we study the effect of the presence of protected and proxy features on participants' perception of model fairness and their ability to improve demographic parity over an AI alone. Further, we examine how different treatments -- explanations, model bias disclosure and proxy correlation disclosure -- affect fairness perception and parity. We find that explanations help people detect direct but not indirect biases. Additionally, regardless of bias type, explanations tend to increase agreement with model biases. Disclosures can help mitigate this effect for indirect biases, improving both unfairness recognition and decision-making fairness. We hope that our findings can help guide further research into advancing explanations in support of fair human-AI decision-making.
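
To make the parity metric concrete, the following is a minimal sketch of a demographic parity gap, assuming the common definition as the difference in positive-decision rates between two groups. The paper's exact formulation is not given in this excerpt, and the function names, group labels, and toy data below are hypothetical.

```python
from typing import Hashable, Sequence


def positive_rate(decisions: Sequence[int]) -> float:
    """Fraction of decisions that are positive (1 = approve, 0 = reject)."""
    if not decisions:
        raise ValueError("need at least one decision")
    return sum(decisions) / len(decisions)


def demographic_parity_gap(
    decisions: Sequence[int],
    groups: Sequence[Hashable],
    group_a: Hashable,
    group_b: Hashable,
) -> float:
    """Positive-decision rate of group_a minus that of group_b.

    A value of 0 means both groups are approved at the same rate.
    This is an assumed, standard definition, not necessarily the paper's.
    """
    if len(decisions) != len(groups):
        raise ValueError("decisions and groups must have the same length")
    a = [d for d, g in zip(decisions, groups) if g == group_a]
    b = [d for d, g in zip(decisions, groups) if g == group_b]
    return positive_rate(a) - positive_rate(b)


# Hypothetical toy data: eight loan decisions split across two groups.
decisions = [1, 1, 1, 0, 1, 0, 0, 0]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]
gap = demographic_parity_gap(decisions, groups, "A", "B")
print(gap)  # 0.75 - 0.25 = 0.5
```

Comparing this gap for the AI's recommendations alone against the gap for the final human decisions would give one way to ask whether the human-AI team improves parity over the AI alone.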

Paper Structure (43 sections, 7 equations, 16 figures, 6 tables)

Figures (16)

  • Figure 1: Summary of primary effects considered in our study. Participants are assigned to a condition either with or without explanations and then complete the study, moving horizontally from phase 1 to phase 2. We then compare the results of different combinations of phases and explanation conditions to investigate the effects of explanations alone, disclosures without explanations, and disclosures with explanations.
  • Figure 2: Order of study phases.
  • Figure 3: Example profile with an explanation from the "protected" model (left), without an explanation (right), and the question posed to the user (below). The predicted outcome is completing the loan on time. The labels on the left show the name of each feature. The labels on the right show the value of each feature for the current applicant and the percent/percentile of this value in the training data. For the explanation, positive blue values on the x-axis correspond to "Complete" predictions and negative red values to "Late". See Figure \ref{fig:task} in the Appendix for an example profile as shown in the study interface.
  • Figure 4: a) Bias disclosure. b) Full correlation disclosure. Proxy "no correlation disclosure" conditions include the top paragraph but with the example of a hiring system relying on the relationship between zip code and race. See Figure \ref{fig:bias_disclosure} and Figure \ref{fig:corr_disclosure} in the Appendix for how these disclosures are shown in the study interface.
  • Figure 5: Effect of explanations alone on various metrics when bias stems from usage of a protected vs proxy feature. The marks show the average and standard error of the given metric across participants in the given condition.
  • ...and 11 more figures