Human-in-the-loop Fairness: Integrating Stakeholder Feedback to Incorporate Fairness Perspectives in Responsible AI

Evdoxia Taka; Yuri Nakao; Ryosuke Sonoda; Takuya Yokota; Lin Luo; Simone Stumpf

TL;DR

This work addresses the challenge that fairness in high-risk AI is context-dependent and metric-diverse by introducing a human-in-the-loop approach where lay users provide instance-level fairness feedback to retrain a credit-scoring model. Through two studies using the Home Credit dataset and an XGBoost classifier, the authors examine how global and personalized retraining informed by stakeholder input affects multiple fairness metrics, including $DPR$, $CDD$, $EOD$, $AOD$, $PPD$, and counterfactual fairness $CF$. Key findings show that user feedback can improve group fairness metrics like $DPR$ and $AOD$, especially when using weight-adjusted or unfair-label integrations, while accuracy may decline modestly; online interactive feedback (Study 2) demonstrates both potential gains and trade-offs in personalized models, and highlights usability challenges. The work contributes two open datasets, code frameworks, and empirical insights that guide the design of participatory, responsibility-driven AI systems, while outlining practical implications and future directions for broader adoption and improved stakeholder alignment with fairness notions.

Abstract

Fairness is a growing concern for high-risk decision-making using Artificial Intelligence (AI), but ensuring it through purely technical means is challenging: there is no universally accepted fairness measure, fairness is context-dependent, and there might be conflicting perspectives on what is considered fair. Thus, involving stakeholders, often without a background in AI or fairness, is a promising avenue. Research to directly involve stakeholders is in its infancy, and many questions remain on how to support stakeholders in giving feedback on fairness, and how this feedback can be integrated into AI models. Our work follows an approach where stakeholders give feedback on the fairness of specific decision instances and their outcomes, and this feedback is then used to retrain an AI model. To investigate this approach, we conducted two studies of a complex AI model for credit rating used in loan applications. In study 1, we collected feedback from 58 lay users on loan application decisions, and conducted offline experiments to investigate the effects on accuracy and fairness metrics. In study 2, we deepened this investigation by showing 66 participants the results of their feedback with respect to fairness, and then conducted further offline analyses. Our work contributes two datasets and associated code frameworks to bootstrap further research, highlights the opportunities and challenges of employing lay user feedback for improving AI fairness, and discusses practical implications for developing AI applications that more closely reflect stakeholder views about fairness.
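
As a rough illustration of the kind of group fairness metrics the studies track, the sketch below computes two of them on toy data. It assumes that DPR denotes the demographic parity ratio (lowest group acceptance rate divided by the highest) and that AOD denotes the average odds difference (mean of the false-positive-rate and true-positive-rate gaps between an unprivileged and a privileged group), as these are commonly defined; the paper's exact definitions may differ. The function names, group labels, and data are hypothetical and are not taken from the authors' code.

```python
from typing import Hashable, Sequence


def _rate(values: Sequence[int]) -> float:
    """Fraction of positive (1) entries; assumes a non-empty sequence."""
    return sum(values) / len(values)


def demographic_parity_ratio(y_pred: Sequence[int],
                             group: Sequence[Hashable]) -> float:
    """Lowest group acceptance rate divided by the highest (assumed DPR)."""
    rates = []
    for g in set(group):
        preds = [p for p, gg in zip(y_pred, group) if gg == g]
        rates.append(_rate(preds))
    highest = max(rates)
    # If no group receives any positive prediction, treat the groups as equal.
    return min(rates) / highest if highest > 0 else 1.0


def average_odds_difference(y_true: Sequence[int],
                            y_pred: Sequence[int],
                            group: Sequence[Hashable],
                            unprivileged: Hashable,
                            privileged: Hashable) -> float:
    """Mean of the FPR gap and TPR gap between two groups (assumed AOD).

    Assumes each group contains at least one true positive and one
    true negative, so both rates are defined.
    """
    def tpr_fpr(g: Hashable) -> tuple[float, float]:
        pos = [p for t, p, gg in zip(y_true, y_pred, group) if gg == g and t == 1]
        neg = [p for t, p, gg in zip(y_true, y_pred, group) if gg == g and t == 0]
        return _rate(pos), _rate(neg)

    tpr_u, fpr_u = tpr_fpr(unprivileged)
    tpr_p, fpr_p = tpr_fpr(privileged)
    return 0.5 * ((fpr_u - fpr_p) + (tpr_u - tpr_p))


# Toy data: 1 = loan accepted, 0 = rejected; groups "a" and "b" are made up.
y_true = [1, 0, 1, 0, 1, 0, 1, 0]
y_pred = [1, 0, 1, 1, 0, 0, 1, 0]
group = ["a", "a", "a", "a", "b", "b", "b", "b"]

dpr = demographic_parity_ratio(y_pred, group)
aod = average_odds_difference(y_true, y_pred, group,
                              unprivileged="b", privileged="a")
print(f"DPR = {dpr:.3f}, AOD = {aod:.3f}")
```

Under these assumed definitions, a DPR closer to 1 and an AOD closer to 0 indicate more similar treatment of the two groups, which is the direction in which a retrained model would be expected to move if feedback improves group fairness.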

Paper Structure (37 sections, 7 figures, 8 tables)

Introduction
Related Work
Measuring Fairness and Bias Mitigation
Human-Centered Fairness
User Study 1
Method
The Dataset and AI Model
The User Interface
Participants.
Procedure.
Data Collected.
Offline Experiments
Results
Baseline
Effects of Integrating Feedback on Global Model Fairness
...and 22 more sections

Figures (7)

Figure 1: The UI used in user study 1 to collect participants' feedback on the fairness of the AI model's decisions on the outcome of loan applications. (a) System Overview. Dataset View: (b) application details list and similarity graph, (c) user feedback input modal showing when the "Decide" button of an application is clicked. Model View: (d) causal graph of the model's attributes and (e) details of the Installment node shown when the node is clicked. (f) Sorted list of acceptance rates for loan application groups shown below the causal graph and (g) the feature combinations of the group of applications having the lowest acceptance rate.
Figure 2: DPR of the protected attributes for one of the participants calculated based on the Personalized_Labels+Weights approach. Black lines show the baseline, blue lines show the raw values, and red lines the CMA values in each integration step. An integration step represents a single feedback instance integration. The green markers represent the CMA points we use to calculate the percentage changes from baseline. It shows that raw feedback (blue) is very noisy which can be smoothed out through the CMA (red) to determine a general pattern of feedback. This participant provided only 6 feedback instances.
Figure 3: Average values of DPR, EOD, and AOD for the personalized models of all participants in user study 1.
Figure 4: (a) The IML UI used in user study 2 to collect participants' feedback on the decisions of the AI model. (b) The modal window shown when participants click the "Decide" button of an application appearing in the Applications List. This modal window enables participants to provide their feedback on the fairness of the application's predicted outcome and on the "weight" the model assigns to each attribute in its decision-making.
Figure 5: Percentage change of DPR, AOD and CF from baseline per participant for the Personalized-Labels_Unfair+Weights approach, presented in bar graphs.
...and 2 more figures
