Explanatory Debiasing: Involving Domain Experts in the Data Generation Process to Mitigate Representation Bias in AI Systems
Aditya Bhattacharya, Simone Stumpf, Robin De Croon, Katrien Verbert
TL;DR
This paper tackles representation bias in AI by introducing generic design guidelines for involving domain experts in the data generation and augmentation process. It demonstrates the guidelines through a healthcare-focused prototype and a mixed-methods study with 35 healthcare professionals, showing that expert involvement reduced representation bias without sacrificing model accuracy and also increased expert trust. The work contributes a structured, evidence-based framework (pre-, during-, and post-augmentation) and actionable UI and process guidelines, plus open-source artifacts for replication. It highlights the role of domain experts as a complement to AI experts in debiasing, with implications for more reliable and fair AI systems in high-stakes domains.
Abstract
Representation bias is one of the most common types of bias in artificial intelligence (AI) systems, causing AI models to perform poorly on underrepresented data segments. Although AI practitioners use various methods to reduce representation bias, their effectiveness is often constrained by insufficient domain knowledge in the debiasing process. To address this gap, this paper introduces a set of generic design guidelines for effectively involving domain experts in representation debiasing. We instantiated our proposed guidelines in a healthcare-focused application and evaluated them through a comprehensive mixed-methods user study with 35 healthcare experts. Our findings show that involving domain experts can reduce representation bias without compromising model accuracy. Based on these findings, we also offer recommendations for developers to build robust debiasing systems guided by our generic design guidelines, ensuring more effective inclusion of domain experts in the debiasing process.
