Fair Decisions from Calibrated Scores: Achieving Optimal Classification While Satisfying Sufficiency

Etam Benger, Katrina Ligett

TL;DR

This work tackles binary classification under sufficiency with finite-valued, group-calibrated scores. It derives an exact geometric description of the feasible pairs $(p, q)$, where $p=\mathrm{PPV}(R)$ and $q=\mathrm{FOR}(R)$, and shows how to post-process the scores to obtain the optimal fair classifier using only the group-calibrated scores and group membership. The authors characterize the intersection of the group-wise feasible regions and provide a tractable boundary-tracing algorithm that optimizes the loss, or minimizes the deviation from separation, along the boundary of that intersection. They demonstrate the approach on real data (COMPAS), producing group-specific thresholds that maintain sufficiency with accuracy competitive with the unconstrained optimum. The framework clarifies when sufficiency is achievable without abstention and how to balance sufficiency against separation, offering practical, exact tools for fair decision-making with calibrated scores.

Abstract

Binary classification based on predicted probabilities (scores) is a fundamental task in supervised machine learning. While thresholding scores is Bayes-optimal in the unconstrained setting, using a single threshold generally violates statistical group fairness constraints. Under independence (statistical parity) and separation (equalized odds), such thresholding suffices when the scores already satisfy the corresponding criterion. However, this does not extend to sufficiency: even perfectly group-calibrated scores -- including true class probabilities -- violate predictive parity after thresholding. In this work, we present an exact solution for optimal binary (randomized) classification under sufficiency, assuming finite sets of group-calibrated scores. We provide a geometric characterization of the feasible pairs of positive predictive value (PPV) and false omission rate (FOR) achievable by such classifiers, and use it to derive a simple post-processing algorithm that attains the optimal classifier using only group-calibrated scores and group membership. Finally, since sufficiency and separation are generally incompatible, we identify the classifier that minimizes deviation from separation subject to sufficiency, and show that it can also be obtained by our algorithm, often achieving performance comparable to the optimum.
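
The claim that a single threshold breaks predictive parity even for perfectly group-calibrated scores can be checked numerically. The sketch below is illustrative only: the function name `ppv_for` and the two toy score distributions are hypothetical and do not come from the paper. It relies on the fact that, under calibration, $P(Y=1\mid S=s)=s$, so the PPV is the mean score above the threshold and the FOR is the mean score below it.

```python
import numpy as np

def ppv_for(scores, probs, threshold):
    """PPV and FOR of the rule R = 1 iff S >= threshold, for calibrated scores.

    Calibration means P(Y=1 | S=s) = s, hence
    PPV = E[S | S >= threshold] and FOR = E[S | S < threshold].
    Assumes both sides of the threshold carry positive probability.
    """
    scores = np.asarray(scores, dtype=float)
    probs = np.asarray(probs, dtype=float)
    accepted = scores >= threshold
    ppv = np.sum(scores[accepted] * probs[accepted]) / np.sum(probs[accepted])
    for_ = np.sum(scores[~accepted] * probs[~accepted]) / np.sum(probs[~accepted])
    return ppv, for_

# Hypothetical toy data: both groups share the score values, not the distribution.
scores = [0.2, 0.4, 0.6, 0.8]
probs_group0 = [0.4, 0.3, 0.2, 0.1]
probs_group1 = [0.1, 0.2, 0.3, 0.4]

ppv0, for0 = ppv_for(scores, probs_group0, threshold=0.5)
ppv1, for1 = ppv_for(scores, probs_group1, threshold=0.5)
print(f"group 0: PPV={ppv0:.3f}, FOR={for0:.3f}")  # PPV=0.667, FOR=0.286
print(f"group 1: PPV={ppv1:.3f}, FOR={for1:.3f}")  # PPV=0.714, FOR=0.333
```

Both groups are calibrated and share the threshold, yet their PPV and FOR differ, which is the gap the post-processing algorithm is meant to close.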

Paper Structure

This paper contains 23 sections, 6 theorems, 79 equations, 3 figures, 2 algorithms.

Key Result

Theorem 3.3

Let $0<\mu<1$ and define $k^*=k^*(\mu):=\min\{\,k\mid \sum_{i\leq k}P(s_i)\geq\mu\,\}$. Then the pair $(p^*(\mu), q^*(\mu))$ is attained by $R^*=R^*(S;\mu)$, defined according to
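
The index $k^*(\mu)$ in the statement above is a cumulative-sum lookup and can be computed directly. The following is a minimal sketch, not the paper's algorithm: the function name `k_star` is hypothetical, and `probs[i-1]` is assumed to hold $P(s_i)$ in the paper's own indexing order of the scores, which this excerpt does not specify.

```python
import numpy as np

def k_star(probs, mu):
    """Smallest 1-based index k with sum_{i<=k} P(s_i) >= mu, for 0 < mu < 1.

    probs[i-1] is assumed to be P(s_i), listed in the paper's score ordering.
    """
    if not 0.0 < mu < 1.0:
        raise ValueError("mu must lie strictly between 0 and 1")
    cumulative = np.cumsum(np.asarray(probs, dtype=float))
    # side="left" gives the first position whose cumulative mass is >= mu.
    k = int(np.searchsorted(cumulative, mu, side="left")) + 1
    # Guard against rounding that leaves the total mass slightly below mu.
    return min(k, len(cumulative))

# Hypothetical distribution over m = 4 score values.
probs = [0.5, 0.25, 0.125, 0.125]
print(k_star(probs, 0.5))  # 1
print(k_star(probs, 0.6))  # 2
print(k_star(probs, 0.9))  # 4
```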

Figures (3)

  • Figure 1: Schematic of the feasible region $\mathcal{C}$. The shaded region depicts a typical feasible region $\mathcal{C}$ corresponding to $m=5$ distinct score values; the blue curve denotes its nontrivial boundary $\partial\mathcal{C}$. For a fixed selection rate $\mu$, all feasible pairs $(p,q)$ lie on the red line segment connecting $(\pi,\pi)$ to $(p^*(\mu),q^*(\mu))$. The boundary is piecewise over intervals $J_k$ on the $p$-axis; for illustration, only $J_m$ and $J_{m-1}$ are shown. Among the $m$ breakpoints of the boundary, the point $(p_2,q_2)$ is shown as an example and corresponds to the deterministic threshold classifier $R^*(S;\mu_2)$.
  • Figure 2: Intersection of feasible regions. Two examples of intersections between $\mathcal{C}^0$ (blue) and $\mathcal{C}^1$ (orange); the green curve denotes the boundary $\partial(\mathcal{C}^0\cap\mathcal{C}^1)$ of the intersection. Top: the group-wise boundaries intersect four times. Bottom: the group-wise boundaries intersect twice; in particular, $\partial(\mathcal{C}^0\cap\mathcal{C}^1)$ contains no breakpoints of either group, corresponding to deterministic threshold classifiers. In both cases, assuming $P(A=1)=\tfrac{1}{2}$, the point on $\partial(\mathcal{C}^0\cap\mathcal{C}^1)$ that maximizes overall accuracy (equivalently, minimizes the $0$--$1$ loss) is marked by a black cross. The point that minimizes the deviation from separation, $\Delta_\mathrm{sep}(R)$, is marked by a black $\times$. In the top example, the two optima coincide, whereas in the bottom example the accuracy- and deviation-minimizing points occur at distinct locations along the boundary.
  • Figure 3: COMPAS dataset feasible regions. The dots on the boundaries of the feasible regions correspond to deterministic threshold classifiers, with the leftmost dot representing the rule $R=1$ iff $S\geq 1$ and the rightmost dot $R=1$ iff $S\geq 10$. The point that maximizes overall accuracy (marked by a cross) lies almost exactly on the seventh breakpoint of the Caucasian group's feasible region $\mathcal{C}^0$ (blue) and the sixth breakpoint of the non-Caucasian group's feasible region $\mathcal{C}^1$ (orange).
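
The line-segment statement in the Figure 1 caption can be recovered from the law of total probability. The short derivation below is a sketch under the assumption, inferred from the caption, that $\pi$ denotes the base rate $P(Y=1)$ and $\mu$ the selection rate $P(R=1)$, with $p=P(Y=1\mid R=1)$ and $q=P(Y=1\mid R=0)$.

```latex
\begin{align*}
\pi &= P(Y=1) \\
    &= P(R=1)\,P(Y=1\mid R=1) + P(R=0)\,P(Y=1\mid R=0) \\
    &= \mu\,p + (1-\mu)\,q ,
\qquad\text{so}\qquad
q = \frac{\pi - \mu\,p}{1-\mu}.
\end{align*}
```

For fixed $\mu$ this is a line in the $(p,q)$ plane with slope $-\mu/(1-\mu)$, and substituting $p=\pi$ gives $q=\pi$, so the line passes through $(\pi,\pi)$.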

Theorems & Definitions (17)

  • Definition 2.1: Sufficiency
  • Definition 2.2: Calibration and group-calibration
  • Definition 2.3: Predictive parity
  • Definition 3.1: Feasibility
  • Definition 3.2
  • Theorem 3.3
  • Theorem 3.4
  • Definition 4.1: Subgroup-Feasibility
  • Theorem \ref{thm:ttop}: Restated
  • proof
  • ...and 7 more