Accuracy vs. Accuracy: Computational Tradeoffs Between Classification Rates and Utility

Noga Amit, Omer Reingold, Guy N. Rothblum

TL;DR

This paper investigates fairness in prediction when labels carry rich information beyond binary outcomes, introducing multi-accuracy and multi-calibration as foundations for aligning predictions with the Bayes-optimal distribution $p^*$. It distinguishes how decision rules interact with these notions, showing that affine rules can preserve decision-accuracy under MA/MC, while non-affine rules can render calibration and decision-accuracy computationally incompatible. The authors prove strong impossibility results for jointly achieving calibration and decision/classification accuracy in hard settings, yet provide positive results by showing that each desideratum can be achieved separately via calibration and post-processing, with connections to omnipredictors. They also develop relaxations of multi-calibration to enable efficient learning and scalable loss minimization, employing both cryptographic and information-theoretic indistinguishability frameworks and an Outcome Indistinguishability approach to train predictors robust to multiple loss functions. Overall, the work clarifies the tradeoffs between accuracy of predictions, accuracy of decisions, and loss minimization under fairness constraints, providing practical deployment options based on calibration-driven post-processing.

Abstract

We revisit the foundations of fairness and its interplay with utility and efficiency in settings where the training data contain richer labels, such as individual types, rankings, or risk estimates, rather than just binary outcomes. In this context, we propose algorithms that achieve stronger notions of evidence-based fairness than are possible in standard supervised learning. Our methods support classification and ranking techniques that preserve accurate subpopulation classification rates, as suggested by the underlying data distributions, across a broad class of classification rules and downstream applications. Furthermore, our predictors enable loss minimization, whether aimed at maximizing utility or in the service of fair treatment. Complementing our algorithmic contributions, we present impossibility results demonstrating that simultaneously achieving accurate classification rates and optimal loss minimization is, in some cases, computationally infeasible. Unlike prior impossibility results, our notions are not inherently in conflict and are simultaneously satisfied by the Bayes-optimal predictor. Furthermore, we show that each notion can be satisfied individually via efficient learning. Our separation thus stems from the computational hardness of learning a sufficiently good approximation of the Bayes-optimal predictor. These computational impossibilities present a choice between two natural and attainable notions of accuracy that could both be motivated by fairness.

Paper Structure

This paper contains 49 sections, 27 theorems, 224 equations, 2 figures.

Key Result

Theorem 1.1

Let $\rho : \mathcal{Y} \to \{0,1\}$ be an affine decision rule. If a predictor $\tilde{p}$ is multi-accurate with respect to a collection of groups $\mathcal{C}$, then it is also decision-accurate for every group in $\mathcal{C}$.
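
The intuition behind this implication can be sketched with linearity of expectation. The formal definitions are not reproduced on this page, so the following rests on assumptions made only for illustration: $\mathcal{Y} \subseteq \mathbb{R}^k$, the rule has the form $\rho(y) = \langle a, y \rangle + b$ on $\mathcal{Y}$, $\tilde{y} \sim \tilde{p}(x)$ and $y^* \sim p^*(x)$, and multi-accuracy means $\left| \mathbb{E}\left[ \mathbf{1}_S(x)\,(\tilde{y}_j - y^*_j) \right] \right| \le \alpha$ for every coordinate $j$ and every group $S \in \mathcal{C}$. The symbols $a$, $b$, and $\alpha$ are hypothetical names, not the paper's notation. Under these assumptions, for every $S \in \mathcal{C}$:

$$
\left| \mathbb{E}\left[ \mathbf{1}_S(x)\,\bigl(\rho(\tilde{y}) - \rho(y^*)\bigr) \right] \right|
= \left| \sum_{j=1}^{k} a_j \, \mathbb{E}\left[ \mathbf{1}_S(x)\,(\tilde{y}_j - y^*_j) \right] \right|
\le \|a\|_1 \, \alpha .
$$

The offset $b$ cancels in the difference, and the coordinate-wise errors combine linearly. A non-affine rule does not commute with the expectation in this way, which is consistent with the separate treatment of non-affine rules in the negative results.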

Figures (2)

  • Figure 1: Positive and negative results. MA: Multi-Accuracy, MC: Multi-Calibration, MAD: Multi-Accuracy-on-Decision, MAC: Multi-Accuracy-on-Classification. The top row contains properties defined over the predictor; the bottom row, over the action function obtained by applying a decision rule. A green arrow from $A$ to $B$ means that satisfying $A$ implies satisfying $B$. A green dashed arrow indicates that applying a decision rule to a predictor satisfying $A$ yields an action function satisfying $B$. A red arrow between $A$ and $B$ means that achieving both simultaneously is computationally hard. The implication marked $(*)$ holds for affine decision rules; the impossibility marked $(**)$ holds for rules that are far from affine.
  • Figure 2: Choice in how to use a predictor. While a single predictor may enable both loss minimization and MAC, these goals cannot be achieved simultaneously. This leads to two distinct paths: applying the loss-minimizing decision rule to the full distribution $\tilde{p}$ to directly minimize loss (left), or first sampling from $\tilde{p}$ and then applying the decision rule to achieve MAC (right).
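
The two paths described in the Figure 2 caption can be illustrated with a toy sketch. Everything here is an assumption made for illustration and not the authors' construction: the prediction `p_tilde` is a distribution over two label values, the loss is a 0-1 loss table, the decision rule `rho` is a lookup table over labels, and the function names are hypothetical.

```python
import random

def loss_minimizing_action(p_tilde, loss):
    """Left path: act on the full predicted distribution.

    p_tilde: list of probabilities over label values.
    loss: loss[a][y] is the loss of taking action a when the label is y.
    Returns the action with the smallest expected loss under p_tilde.
    """
    expected = [sum(l * p for l, p in zip(row, p_tilde)) for row in loss]
    return min(range(len(loss)), key=expected.__getitem__)

def sampled_decision(p_tilde, rho, rng):
    """Right path: sample a label from p_tilde, then apply the decision rule."""
    y = rng.choices(range(len(p_tilde)), weights=p_tilde, k=1)[0]
    return rho[y]

# Hypothetical setup: every individual in a group gets the same prediction.
p_tilde = [0.6, 0.4]                    # assumed Pr[label = 0], Pr[label = 1]
zero_one_loss = [[0.0, 1.0], [1.0, 0.0]]
rho = [0, 1]                            # decision rule: classify label 1 as positive

rng = random.Random(0)
n = 20000
# Left path: the same deterministic action for everyone in the group.
left_rate = sum(loss_minimizing_action(p_tilde, zero_one_loss) for _ in range(n)) / n
# Right path: the positive-classification rate tracks the predicted rate (about 0.4).
right_rate = sum(sampled_decision(p_tilde, rho, rng) for _ in range(n)) / n
```

In this toy setting the left path minimizes expected 0-1 loss but classifies nobody as positive, while the right path yields a positive rate close to the predicted 0.4 at the cost of higher expected loss. This is only meant to make the choice in the caption concrete; it says nothing about the computational hardness results.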

Theorems & Definitions (97)

  • Theorem 1.1: Informal, see Theorem \ref{thm:MA_implies_MAD}
  • Theorem 1.2: Informal, see Theorem \ref{thm:impossibility_result}
  • Proof: Proof overview for \ref{thm:non-affine-negative}
  • Theorem 1.3: Informal, see Theorem \ref{thm:impossibility_result_of_loss}
  • Proof: Proof overview for \ref{thm:MAC-lossMin-negative}
  • Theorem 1.4: Loss Minimization via Multi-Calibration, Informal Statement of \ref{thm:MC_for_loss_mini}
  • Definition 2.1: Stochastic Vectors
  • Definition 2.2: Coordinate-Wise-Multi-Accuracy for Probabilistic Predictors
  • Definition 2.3: Threshold-Multi-Accuracy for Probabilistic Rankings
  • Remark 2.4
  • ...and 87 more