Human-in-the-loop Fairness: Integrating Stakeholder Feedback to Incorporate Fairness Perspectives in Responsible AI
Evdoxia Taka, Yuri Nakao, Ryosuke Sonoda, Takuya Yokota, Lin Luo, Simone Stumpf
TL;DR
Fairness in high-risk AI is context-dependent and has no single agreed metric. This work addresses that challenge with a human-in-the-loop approach in which lay users give instance-level fairness feedback that is used to retrain a credit-scoring model. Across two studies using the Home Credit dataset and an XGBoost classifier, the authors examine how global and personalized retraining informed by stakeholder input affects multiple fairness metrics, including $DPR$, $CDD$, $EOD$, $AOD$, $PPD$, and counterfactual fairness ($CF$). User feedback can improve group fairness metrics such as $DPR$ and $AOD$, especially with the weight-adjusted or unfair-label integrations, though accuracy may decline modestly. The online interactive feedback in Study 2 shows both potential gains and trade-offs for personalized models and highlights usability challenges. The work contributes two open datasets, code frameworks, and empirical insights to guide the design of participatory, responsible AI systems, and it outlines practical implications and future directions for broader adoption and closer alignment with stakeholders' notions of fairness.
Abstract
Fairness is a growing concern for high-risk decision-making using Artificial Intelligence (AI), but ensuring it through purely technical means is challenging: there is no universally accepted fairness measure, fairness is context-dependent, and there may be conflicting perspectives on what is considered fair. Thus, involving stakeholders, who often have no background in AI or fairness, is a promising avenue. Research that directly involves stakeholders is in its infancy, and many questions remain about how to support stakeholders in giving feedback on fairness and how this feedback can be integrated into AI models. Our work follows an approach in which stakeholders give feedback on the fairness of specific decision instances and their outcomes, and this feedback is then used to retrain an AI model. To investigate this approach, we conducted two studies of a complex AI model for credit rating used in loan applications. In Study 1, we collected feedback from 58 lay users on loan application decisions and conducted offline experiments to investigate the effects on accuracy and fairness metrics. In Study 2, we deepened this investigation by showing 66 participants the results of their feedback with respect to fairness, and then conducted further offline analyses. Our work contributes two datasets and associated code frameworks to bootstrap further research, highlights the opportunities and challenges of employing lay user feedback for improving AI fairness, and discusses practical implications for developing AI applications that more closely reflect stakeholder views about fairness.
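
To make two of the group fairness metrics named in the TL;DR concrete, the following is a minimal Python sketch of a demographic parity ratio and an average odds difference. It assumes the commonly used definitions (ratio of the lowest to the highest group selection rate; mean of the between-group differences in false positive rate and true positive rate), which may differ in detail from the definitions of $DPR$ and $AOD$ used in the paper. The function names, the toy labels, and the choice of group "a" as the privileged group are illustrative assumptions, not taken from the authors' code or data.

```python
# Illustrative sketch only: common textbook definitions, not the authors' code.
# Label 1 is assumed to be the favourable outcome (e.g. loan approved).

def _mean(values):
    return sum(values) / len(values)


def demographic_parity_ratio(y_pred, group):
    """Ratio of the lowest to the highest selection rate across groups."""
    rates = []
    for g in sorted(set(group)):
        rates.append(_mean([p for p, gi in zip(y_pred, group) if gi == g]))
    return min(rates) / max(rates)


def _tpr_fpr(pairs):
    """True and false positive rates from (y_true, y_pred) pairs."""
    tpr = _mean([p for t, p in pairs if t == 1])
    fpr = _mean([p for t, p in pairs if t == 0])
    return tpr, fpr


def average_odds_difference(y_true, y_pred, group, privileged):
    """Mean of (FPR gap + TPR gap), unprivileged minus privileged."""
    rows = list(zip(y_true, y_pred, group))
    priv = [(t, p) for t, p, g in rows if g == privileged]
    unpriv = [(t, p) for t, p, g in rows if g != privileged]
    tpr_p, fpr_p = _tpr_fpr(priv)
    tpr_u, fpr_u = _tpr_fpr(unpriv)
    return 0.5 * ((fpr_u - fpr_p) + (tpr_u - tpr_p))


# Toy data (hypothetical): eight decisions, two groups of four.
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]
group = ["a", "a", "a", "a", "b", "b", "b", "b"]

dpr = demographic_parity_ratio(y_pred, group)
aod = average_odds_difference(y_true, y_pred, group, privileged="a")
print(f"DPR = {dpr:.3f}, AOD = {aod:.3f}")
```

Under these definitions, a ratio of 1 and a difference of 0 indicate parity between the groups. In the toy data, group "b" is selected less often and has lower true and false positive rates than group "a", so the ratio falls below 1 and the difference is negative.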
