Procedural Fairness and Its Relationship with Distributive Fairness in Machine Learning
Ziming Wang, Changwu Huang, Ke Tang, Xin Yao
TL;DR
This work addresses the gap between procedural fairness and distributive fairness in ML by introducing an in-training objective based on the $GPF_{FAE}$ metric to enforce fair decision logic. It replaces costly SHAP explanations with a gradient-based feature attribution explanation (FAE) and adds a regularization term to the loss, yielding procedurally fair models that also improve distributive fairness across seven datasets at an average accuracy loss of only about $0.9\%$. The paper also investigates how inherent dataset bias and the model's procedural fairness influence distributive fairness, showing that the two sources of bias can either compound or cancel each other in the outcomes, and that optimizing procedural fairness generally reduces unfairness at the source. A key takeaway is that attaining distributive fairness via data debiasing plus procedural fairness training offers a root-cause solution, while optimizing distributive fairness alone can produce fair outcomes but leave the underlying process or data biased. The findings have practical implications for designing fair ML pipelines and motivate future multi-objective optimization to balance accuracy, procedural fairness, and distributive fairness.
Abstract
Fairness in machine learning (ML) has garnered significant attention in recent years. While existing research has predominantly focused on the distributive fairness of ML models, there has been limited exploration of procedural fairness. This paper proposes a novel method to achieve procedural fairness during the model training phase. The effectiveness of the proposed method is validated through experiments conducted on one synthetic and six real-world datasets. Additionally, this work studies the relationship between procedural fairness and distributive fairness in ML models. On one hand, the impact of dataset bias and of an ML model's procedural fairness on its distributive fairness is examined. The results highlight a significant influence of both dataset bias and procedural fairness on distributive fairness. On the other hand, the distinctions between optimizing procedural and distributive fairness metrics are analyzed. Experimental results demonstrate that optimizing procedural fairness metrics mitigates biases introduced or amplified by the decision-making process, thereby ensuring fairness in the decision-making process itself, as well as improving distributive fairness. In contrast, optimizing distributive fairness metrics encourages the ML model's decision-making process to favor disadvantaged groups, counterbalancing the inherent preferences for advantaged groups present in the dataset and ultimately achieving distributive fairness.
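To make the idea of an in-training procedural fairness objective concrete, the following is a minimal sketch, not the paper's method. It assumes a logistic regression model, uses the raw input gradient as a stand-in for a gradient-based feature attribution, and penalizes the squared distance between the mean attributions of two groups. The function names (`attribution_gap`, `regularized_loss`), the weight `lam`, and the choice of penalty are all hypothetical illustrations; the paper's actual procedural fairness metric is defined differently.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(w, b, x):
    """Predicted probability of the positive class for one sample."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


def input_gradient(w, b, x):
    """Gradient of the predicted probability w.r.t. the input features.

    For logistic regression, d p / d x_j = p * (1 - p) * w_j.
    Used here as an assumed, simplified gradient-based attribution.
    """
    p = predict(w, b, x)
    return [p * (1.0 - p) * wj for wj in w]


def mean_attribution(w, b, xs):
    """Average the per-sample attributions over a group of samples."""
    total = [0.0] * len(w)
    for x in xs:
        for j, g in enumerate(input_gradient(w, b, x)):
            total[j] += g
    return [t / len(xs) for t in total]


def attribution_gap(w, b, group_a, group_b):
    """Squared Euclidean distance between the two groups' mean attributions."""
    mean_a = mean_attribution(w, b, group_a)
    mean_b = mean_attribution(w, b, group_b)
    return sum((a - c) ** 2 for a, c in zip(mean_a, mean_b))


def bce(w, b, xs, ys):
    """Mean binary cross-entropy over the samples."""
    eps = 1e-12  # guards against log(0)
    total = 0.0
    for x, y in zip(xs, ys):
        p = predict(w, b, x)
        total -= y * math.log(p + eps) + (1 - y) * math.log(1.0 - p + eps)
    return total / len(xs)


def regularized_loss(w, b, xs, ys, group_a, group_b, lam):
    """Prediction loss plus a weighted procedural fairness penalty (hypothetical form)."""
    return bce(w, b, xs, ys) + lam * attribution_gap(w, b, group_a, group_b)


# Toy usage: two samples per group, two features.
group_a = [[0.0, 1.0], [1.0, 0.0]]
group_b = [[2.0, 1.0], [3.0, 0.0]]
xs = group_a + group_b
ys = [0, 1, 1, 1]
loss = regularized_loss([1.0, -0.5], 0.0, xs, ys, group_a, group_b, lam=1.0)
```

Setting `lam` to zero recovers the plain prediction loss, while larger values trade accuracy for more similar decision logic across the two groups; this mirrors, in a simplified way, the accuracy versus procedural fairness trade-off the paper reports.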
