FALE: Fairness-Aware ALE Plots for Auditing Bias in Subgroups
Giorgos Giannopoulos, Dimitris Sacharidis, Nikolas Theologitis, Loukas Kavouras, Ioannis Emiris
TL;DR
Subgroup fairness remains challenging to audit and explain. The paper introduces FALE plots, which extend accumulated local effects (ALE) to measure how the values of a feature influence subgroup unfairness under a selected fairness notion and sensitive attribute. FALE(x_i) aggregates fairness differences across bins of the feature to produce a visual, population-informed summary of bias shifts. The authors demonstrate FALE on the Adult dataset with an XGBoost classifier, showing how bias with respect to the sex attribute varies across age, education, and work hours, and argue that FALE provides a practical first-step tool for practitioners to identify problematic subgroups.
Abstract
Fairness is steadily becoming a crucial requirement of Machine Learning (ML) systems. A particularly important notion is subgroup fairness, i.e., fairness in subgroups of individuals that are defined by more than one attribute. Identifying bias in subgroups can be both computationally challenging and problematic with respect to how comprehensible and intuitive the findings are to end users. In this work we focus on the latter aspects; we propose an explainability method tailored to identifying potential bias in subgroups and visualizing the findings in a user-friendly manner. In particular, we extend the ALE plots explainability method, proposing FALE (Fairness-aware Accumulated Local Effects) plots, a method for measuring the change in fairness for an affected population corresponding to different values of a feature (attribute). We envision FALE functioning as an efficient, user-friendly, comprehensible, and reliable first-stage tool for identifying subgroups with potential bias issues.
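To make the idea above concrete, the following is a minimal sketch of one plausible reading of an ALE-style fairness accumulation. It is not the authors' implementation and may differ from the exact FALE definition in the paper. The names `parity_gap`, `fale_sketch`, and `toy_predict` are hypothetical, statistical parity difference is assumed as the fairness notion, and the data are synthetic rather than the Adult dataset with an XGBoost classifier.

```python
import numpy as np


def parity_gap(y_pred, protected):
    """Statistical parity difference: mean prediction of the protected
    group minus mean prediction of the unprotected group."""
    return y_pred[protected].mean() - y_pred[~protected].mean()


def fale_sketch(predict, X, feature, protected, n_bins=5):
    """ALE-style accumulation of fairness changes along one feature.

    Assumed procedure (hypothetical, for illustration only):
      1. Split the feature into quantile bins.
      2. For the rows in each bin, move the feature to the lower and the
         upper bin edge and record the change in the fairness metric.
      3. Accumulate the changes and center them by the population-weighted mean.
    """
    x = X[:, feature]
    edges = np.unique(np.quantile(x, np.linspace(0.0, 1.0, n_bins + 1)))
    if len(edges) < 2:
        raise ValueError("feature is constant; no bins can be formed")
    n = len(edges) - 1
    # Bin k covers (edges[k], edges[k+1]]; the first bin also includes edges[0].
    bin_idx = np.clip(np.searchsorted(edges, x, side="left") - 1, 0, n - 1)

    local = np.zeros(n)
    counts = np.zeros(n, dtype=int)
    for k in range(n):
        mask = bin_idx == k
        counts[k] = mask.sum()
        group = protected[mask]
        if group.all() or not group.any():
            continue  # the gap is undefined unless both groups are present
        lo, hi = X[mask].copy(), X[mask].copy()
        lo[:, feature] = edges[k]
        hi[:, feature] = edges[k + 1]
        local[k] = parity_gap(predict(hi), group) - parity_gap(predict(lo), group)

    accumulated = np.cumsum(local)
    centered = accumulated - np.average(accumulated, weights=counts)
    return edges, centered, counts


# Synthetic example: columns are (age, sex); sex == 1 marks the protected group.
ages = np.repeat(np.arange(20, 60), 2).astype(float)
sex = np.tile([0.0, 1.0], 40)
X = np.column_stack([ages, sex])
protected = X[:, 1] == 1


def toy_predict(data):
    # Toy model: favourable outcome only for unprotected rows with age above 40.
    return ((data[:, 0] > 40) & (data[:, 1] == 0)).astype(float)


edges, centered, counts = fale_sketch(toy_predict, X, feature=0,
                                      protected=protected, n_bins=4)
```

In this toy setup the centered curve is flat except for a single drop at the bin whose edges straddle the age threshold, which is the kind of shift a plot of `centered` against `edges[1:]` would surface. Centering by bin population is a design choice of this sketch, so that the curve reads as a deviation from the population-average fairness level.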
