Privilege Scores
Ludwig Bothmann, Philip A. Boustani, Jose M. Alvarez, Giuseppe Casalicchio, Bernd Bischl, Susanne Dandl
TL;DR
This work defines privilege scores (PS) to quantify privilege related to a protected attribute (PA) by comparing real-world decision outcomes with a FiND World baseline in which the PA has no causal effect. The key quantity is $\delta = \pi(\mathbf{x}) - \psi(\mathbf{x}_F)$, the difference between the real-world prediction and its FiND World counterpart. The work develops an estimation framework, with uncertainty quantification via bootstrapping, and introduces privilege score contributions (PSCs), which attribute PS to mediators via Shapley values, along with partial warping to handle complex causal paths. The approach is instantiated with two warping methods (fairadapt and residual-based warping) and evaluated on simulated data and on real-world HMDA mortgage and law-school data, revealing interpretable, policy-relevant drivers of racial and gender privilege. These contributions enable both individual-level and population-level auditing in bias-transforming fair ML, with concrete guidance for mitigating PA-related discrimination and informing the design of affirmative-action policies.
Abstract
Bias-transforming methods of fairness-aware machine learning aim to correct a non-neutral status quo with respect to a protected attribute (PA). Current methods, however, lack an explicit formulation of what drives non-neutrality. We introduce privilege scores (PS) to measure PA-related privilege by comparing the model predictions in the real world with those in a fair world in which the influence of the PA is removed. At the individual level, PS can identify individuals who qualify for affirmative action; at the global level, PS can inform bias-transforming policies. After presenting estimation methods for PS, we propose privilege score contributions (PSCs), an interpretation method that attributes the origin of privilege to mediating features and direct effects. We provide confidence intervals for both PS and PSCs. Experiments on simulated and real-world data demonstrate the broad applicability of our methods and provide novel insights into gender and racial privilege in mortgage and college admissions applications.
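As a rough illustration of the score itself, the sketch below computes individual privilege scores as the difference $\delta = \pi(\mathbf{x}) - \psi(\mathbf{x}_F)$ between real-world and fair-world predictions and attaches a percentile bootstrap interval to their mean. It is a minimal sketch under stated assumptions, not the authors' estimation procedure: the function names `privilege_scores` and `bootstrap_mean_ci` and the toy prediction vectors are hypothetical, both sets of predictions are assumed to be already available (the warping step that produces the fair-world features is not shown), and the bootstrap resamples only the precomputed scores rather than refitting any model.

```python
import numpy as np


def privilege_scores(pi_real, psi_fair):
    """Individual scores delta_i = pi(x_i) - psi(x_F,i) (hypothetical helper)."""
    pi_real = np.asarray(pi_real, dtype=float)
    psi_fair = np.asarray(psi_fair, dtype=float)
    if pi_real.shape != psi_fair.shape:
        raise ValueError("real-world and fair-world predictions must align")
    return pi_real - psi_fair


def bootstrap_mean_ci(delta, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the mean score (simplified assumption:
    only the scores are resampled; no model is refitted)."""
    delta = np.asarray(delta, dtype=float)
    rng = np.random.default_rng(seed)
    n = delta.size
    # Each row of idx is one resample of the n individuals, with replacement.
    idx = rng.integers(0, n, size=(n_boot, n))
    boot_means = delta[idx].mean(axis=1)
    lower, upper = np.quantile(boot_means, [alpha / 2, 1 - alpha / 2])
    return delta.mean(), lower, upper


# Toy predictions, invented purely for illustration.
pi_real = [0.80, 0.65, 0.30, 0.55]   # predictions in the real world
psi_fair = [0.70, 0.65, 0.45, 0.50]  # predictions in the fair world

delta = privilege_scores(pi_real, psi_fair)
mean_ps, ci_lower, ci_upper = bootstrap_mean_ci(delta)
print("individual scores:", np.round(delta, 2))
print("mean score and interval:", mean_ps, ci_lower, ci_upper)
```

Under this sign convention, a positive score means the real-world prediction exceeds the fair-world one for that individual, and a negative score means the opposite. Whether a positive value corresponds to privilege depends on whether a higher prediction is the favorable outcome in the application at hand.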
