FaiREE: Fair Classification with Finite-Sample and Distribution-Free Guarantee
Puheng Li, James Zou, Linjun Zhang
TL;DR
FaiREE is a post-processing method that enforces group fairness in classification with guarantees that hold in finite samples and without distributional assumptions. Starting from the scores of a given classifier, it uses order statistics and Beta-distributed bounds to construct a candidate set of thresholds for which the fairness constraint (e.g., $|DEOO(\phi)|\leq \alpha$) holds with high probability, then selects the candidate with the smallest misclassification error. The paper establishes theoretical guarantees for fairness and near-optimal accuracy, extends the method to multiple fairness notions including Equality of Opportunity and Equalized Odds, and reports favorable empirical performance against state-of-the-art baselines on synthetic and real datasets (e.g., Adult Census). The result is a distribution-free tool for fair classification that comes with guidance on sample-size requirements and applies to a range of fairness constraints.
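The candidate-set step summarized above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that DEOO denotes the difference in true positive rates between two groups, that scores are continuous, and that each group's threshold is the k-th smallest score among that group's positive calibration samples. Under those assumptions, the true positive rate at the k-th order statistic out of n follows a Beta(n - k + 1, k) distribution regardless of the score distribution, which is what makes the check distribution-free. The function names, the Monte Carlo estimate, and the parameter `delta` are hypothetical choices for illustration; the final step of picking the lowest-error candidate is not shown.

```python
import numpy as np

def deoo_violation_prob(k_a, n_a, k_b, n_b, alpha, n_draws=20000, seed=0):
    """Monte Carlo estimate of P(|TPR_a - TPR_b| > alpha).

    Group g's threshold is assumed to be the k_g-th smallest score among its
    n_g positive calibration samples (hypothetical setup for this sketch).
    """
    rng = np.random.default_rng(seed)
    # For i.i.d. continuous scores, F(k-th order statistic) ~ Beta(k, n - k + 1),
    # so the true positive rate 1 - F(threshold) ~ Beta(n - k + 1, k).
    tpr_a = rng.beta(n_a - k_a + 1, k_a, size=n_draws)
    tpr_b = rng.beta(n_b - k_b + 1, k_b, size=n_draws)
    return float(np.mean(np.abs(tpr_a - tpr_b) > alpha))

def candidate_pairs(n_a, n_b, alpha, delta, n_draws=20000):
    """Return all rank pairs (k_a, k_b) whose estimated violation
    probability is at most delta."""
    return [
        (k_a, k_b)
        for k_a in range(1, n_a + 1)
        for k_b in range(1, n_b + 1)
        if deoo_violation_prob(k_a, n_a, k_b, n_b, alpha, n_draws) <= delta
    ]

if __name__ == "__main__":
    # Example: 10 positive calibration samples per group.
    pairs = candidate_pairs(n_a=10, n_b=10, alpha=0.5, delta=0.2, n_draws=2000)
    print(f"{len(pairs)} of 100 rank pairs pass the fairness check")
```

In this sketch, rank pairs with similar relative positions in the two groups tend to pass the check, while pairs that put one group's threshold near its lowest score and the other's near its highest are rejected. A smaller `alpha` or `delta` shrinks the candidate set, which is one way to see why a minimum sample size is needed for the set to be non-empty.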
Abstract
Algorithmic fairness plays an increasingly critical role in machine learning research. Several group fairness notions and algorithms have been proposed. However, the fairness guarantee of existing fair classification methods mainly depends on specific data distributional assumptions, often requiring large sample sizes, and fairness could be violated when there is a modest number of samples, which is often the case in practice. In this paper, we propose FaiREE, a fair classification algorithm that can satisfy group fairness constraints with finite-sample and distribution-free theoretical guarantees. FaiREE can be adapted to satisfy various group fairness notions (e.g., Equality of Opportunity, Equalized Odds, Demographic Parity, etc.) and achieve the optimal accuracy. These theoretical guarantees are further supported by experiments on both synthetic and real data. FaiREE is shown to have favorable performance over state-of-the-art algorithms.
