Inclusive Design of AI's Explanations: Just for Those Previously Left Out, or for Everyone?
Md Montaser Hamid, Fatima Moussaoui, Jimena Noa Guevara, Andrew Anderson, Puja Agarwal, Jonathan Dodge, Margaret Burnett
TL;DR
This study investigates whether applying GenderMag-driven inclusive design to Explainable AI (XAI) explanations yields curb-cut effects, that is, benefits not only for underserved users but for all users. Using a between-subjects design with two MNK game prototypes, the authors compare mental model concepts scores, prediction accuracy, and inclusivity among AI-naïve participants. They find that the inclusivity fixes improve overall mental-model understanding and engagement with explanations (a curb-cut effect) but do not improve, and appear to harm, prediction accuracy (a curb-fence effect). The improvements are strongest for Abi-like problem-solvers and women, reducing the gender gap, though the work cautions about potential overreliance on explanations and calls for careful deployment of inclusive XAI design.
Abstract
Motivations: Explainable Artificial Intelligence (XAI) systems aim to improve users' understanding of AI, but XAI research shows many cases in which an explanation serves some users well and is unhelpful to others. In non-AI systems, some software practitioners have used inclusive design approaches, and sometimes their improvements turned out to be "curb-cut" improvements: they not only addressed the needs of underserved users but also made the products better for everyone. So, if AI practitioners used inclusive design approaches, they too might create curb-cut improvements, i.e., better explanations for everyone. Objectives: To find out, we investigated the curb-cut effects of inclusivity-driven fixes on users' mental models of AI when using an XAI prototype. The prototype and fixes came from an AI team who had adopted an inclusive design approach (GenderMag) to improve their XAI prototype. Methods: We ran a between-subjects study with 69 participants who had no AI background. Of these, 34 used the original version of the XAI prototype and 35 used the version with the inclusivity fixes. We compared the two groups' mental model concepts scores, prediction accuracy, and inclusivity. Results: We found four main results. First, the study revealed several curb-cut effects of the inclusivity fixes: overall increased engagement with explanations and better mental model concepts scores. However (second), the inclusivity fixes did not improve participants' prediction accuracy scores; instead, they appear to have harmed them. This "curb-fence" effect (the opposite of the curb-cut effect) revealed the AI explanations' double-edged impact. Third, the AI team's inclusivity fixes brought significant improvements for users whose problem-solving styles had previously been underserved. Further (fourth), the AI team's fixes reduced the gender gap by 45%.
