Fair Decisions from Calibrated Scores: Achieving Optimal Classification While Satisfying Sufficiency
Etam Benger, Katrina Ligett
TL;DR
This work tackles binary classification under sufficiency with finite-valued, group-calibrated scores. It derives an exact geometric description of the feasible $(p, q)$ pairs, where $p=PPV(R)$ is the positive predictive value and $q=FOR(R)$ is the false omission rate, and shows how to post-process the scores to obtain the optimal fair classifier using only the scores and group membership. The authors characterize the intersection of the group-wise feasible regions and provide a tractable boundary-tracing algorithm that either minimizes loss or minimizes deviation from separation within that region. They demonstrate the approach on real data (COMPAS), producing group-specific thresholds that maintain sufficiency with accuracy competitive with the unconstrained optimum. The framework clarifies when sufficiency is achievable without abstention and how to trade off sufficiency against separation, offering practical, exact tools for fair decision-making with calibrated scores.
Abstract
Binary classification based on predicted probabilities (scores) is a fundamental task in supervised machine learning. While thresholding scores is Bayes-optimal in the unconstrained setting, using a single threshold generally violates statistical group fairness constraints. Under independence (statistical parity) and separation (equalized odds), such thresholding suffices when the scores already satisfy the corresponding criterion. However, this does not extend to sufficiency: even perfectly group-calibrated scores -- including true class probabilities -- violate predictive parity after thresholding. In this work, we present an exact solution for optimal binary (randomized) classification under sufficiency, assuming finite sets of group-calibrated scores. We provide a geometric characterization of the feasible pairs of positive predictive value (PPV) and false omission rate (FOR) achievable by such classifiers, and use it to derive a simple post-processing algorithm that attains the optimal classifier using only group-calibrated scores and group membership. Finally, since sufficiency and separation are generally incompatible, we identify the classifier that minimizes deviation from separation subject to sufficiency, and show that it can also be obtained by our algorithm, often achieving performance comparable to the optimum.
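The claim that a single threshold breaks sufficiency even for calibrated scores can be checked on a toy case. The sketch below is an illustration only, not the paper's algorithm: the helper `ppv_and_for` and the two score distributions are hypothetical and chosen for easy arithmetic. Each group's scores are assumed calibrated, so that $P(Y=1 \mid \text{score}=s, \text{group}) = s$.

```python
from fractions import Fraction as F


def ppv_and_for(score_dist, threshold):
    """Return (PPV, FOR) of the rule "predict 1 iff score >= threshold".

    score_dist: list of (score, probability mass) pairs for one group.
    Assumes calibration within the group, i.e. P(Y=1 | score=s) = s.
    """
    pos = [(s, w) for s, w in score_dist if s >= threshold]
    neg = [(s, w) for s, w in score_dist if s < threshold]
    pos_mass = sum(w for _, w in pos)
    neg_mass = sum(w for _, w in neg)
    if pos_mass == 0 or neg_mass == 0:
        raise ValueError("PPV or FOR is undefined: one prediction never occurs")
    # PPV = P(Y=1 | predicted 1); FOR = P(Y=1 | predicted 0).
    ppv = sum(s * w for s, w in pos) / pos_mass
    false_omission_rate = sum(s * w for s, w in neg) / neg_mass
    return ppv, false_omission_rate


# Hypothetical toy groups: two score values each, equal mass on both.
group_a = [(F(1, 5), F(1, 2)), (F(4, 5), F(1, 2))]
group_b = [(F(2, 5), F(1, 2)), (F(3, 5), F(1, 2))]

threshold = F(1, 2)  # one shared threshold for both groups
result_a = ppv_and_for(group_a, threshold)  # (4/5, 1/5)
result_b = ppv_and_for(group_b, threshold)  # (3/5, 2/5)

# Sufficiency for a binary classifier requires equal PPV and equal FOR
# across groups; here both differ despite calibrated scores.
print("group A (PPV, FOR):", result_a)
print("group B (PPV, FOR):", result_b)
```

In this toy case both groups have the same base rate of 1/2 and calibrated scores, yet the shared threshold yields a PPV of 4/5 versus 3/5 and a FOR of 1/5 versus 2/5. Exact rational arithmetic is used so the comparison does not depend on floating-point rounding.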
