Cherry on the Cake: Fairness is NOT an Optimization Problem

Marco Favier, Toon Calders

TL;DR

The paper investigates cherry-picking as a potential consequence of fairness optimization in machine learning. By linking ML classification to cake-cutting, it shows that optimizing group-fairness constraints can inherently produce unfair within-group selections under many common fairness notions. It proves general theorems describing when non-cherry-picking solutions exist and when cherry-picking becomes inevitable, extends classical cake-cutting results to ML via the IPS framework, and discusses implications for practice. The work cautions against treating fairness as a pure optimization problem and advocates calibration and post-processing as more transparent routes to fair outcomes, while outlining future research directions and limitations. The findings have implications for the design of fair algorithms and for policy discussions about how to implement fairness in real systems.

Abstract

In Fair AI literature, the practice of maliciously creating unfair models that nevertheless satisfy fairness constraints is known as "cherry-picking". A cherry-picking model is a model that makes mistakes on purpose, selecting bad individuals from a minority class instead of better candidates from the same minority. The model literally cherry-picks whom to select to superficially meet the fairness constraints while making minimal changes to the unfair model. This practice has been described as "blatantly unfair" and has a negative impact on already marginalized communities, undermining the intended purpose of fairness measures specifically designed to protect these communities. A common assumption is that cherry-picking arises solely from malicious intent and that models designed only to optimize fairness metrics would avoid this behavior. We show that this is not the case: models optimized to minimize fairness metrics while maximizing performance are often forced to cherry-pick to some degree. In other words, cherry-picking might be an inevitable outcome of the optimization process itself. To demonstrate this, we use tools from fair cake-cutting, a mathematical subfield that studies the problem of fairly dividing a resource, referred to as the "cake," among a number of participants. This concept is connected to supervised multi-label classification: any dataset can be thought of as a cake that needs to be distributed among different labels, and the model is the function that divides the cake. We adapt these classical results for machine learning and demonstrate how this connection can be prolifically used for fairness and classification in general.
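The cherry-picking behavior described in the abstract can be illustrated with a minimal sketch (hypothetical scores, not taken from the paper): two selections satisfy the same group-level quota, yet one deliberately passes over the strongest minority candidates.

```python
# Toy illustration of cherry-picking (hypothetical numbers).
# Scores are P(y=1|x) for candidates in a minority group; a group-level
# fairness constraint only fixes HOW MANY minority candidates are selected,
# not WHICH ones, so both selections below satisfy the constraint.

minority = [0.9, 0.8, 0.2, 0.1]  # minority candidates' scores

k = 2  # selection quota imposed by the group-level constraint

fair_pick = sorted(minority, reverse=True)[:k]   # best candidates
cherry_pick = sorted(minority)[:k]               # worst candidates

# Both selections have size k, so the group-level constraint cannot
# distinguish them, even though cherry_pick harms the best minority members.
parity_ok = len(fair_pick) == len(cherry_pick) == k
print(parity_ok, fair_pick, cherry_pick)
```

Both selections are indistinguishable to any metric that only counts per-group selection rates, which is exactly the loophole the paper argues the optimizer itself can be forced into.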

Paper Structure

This paper contains 15 sections, 25 theorems, 136 equations, and 4 figures.

Key Result

Theorem 1

Consider a classification problem on $X$ with atomless measure $\mu_X$. There exists a vector of measures $\boldsymbol{\mu} = (\mu_1,\dots, \mu_n)$ such that the cake-cutting instance $(X , \Sigma_X, \boldsymbol{\mu})$ satisfies: for any matrix $M$.

Figures (4)

  • Figure 1: Optimal decisions according to Weller's theorem on the $\Delta^3$ simplex with labels $Y=\{\text{red},\text{green}, \text{blue} \}$. The RGB decomposition of a color represents the conditional distribution $P(\mathbf{y} \mid x)$.
  • Figure 2: $\mathop{\mathrm{IPS}}\nolimits$ and $\mathop{\mathrm{ROC}}\nolimits$ curve for a binary classification problem where half the population has $P(\mathbf{y} = 1\mid x)=0.9$ and the other half $P(\mathbf{y} = 1\mid x)=0.3$. Straight edges correspond to atoms of the push-forward measure $\mu_{\Delta}$.
  • Figure 3: A visualization of the proof of Theorem `teo_bad_fairness`. After finding $(\overline\eta, \overline\rho)$ for a specific level set of $\mathcal{F}$, the set $X$ is split into ♀ and ♂ and a new model is constructed based on the new level set of $\mathcal{F}$ on ♂.
  • Figure 4: Visual aid for Example `example_revisited`. The line is the constraint obtained for $k = 2/3$. When $t>k$, the Immediate Utility $U_t(\widehat{\mathbf{y}})$ is better for the cherry-picking model than for the fair one. The trivial model is optimal, but other evaluation metrics might prefer the cherry-picking model.
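The two-atom setup described in Figure 2 can be reproduced numerically. The sketch below (my own construction, not the paper's code) computes the vertices of the resulting piecewise-linear ROC curve; its straight edges correspond to the atoms of the score distribution mentioned in the caption.

```python
# Figure 2's setup: half the population has P(y=1|x)=0.9, the other half 0.3.
# A threshold classifier on this score then yields an ROC curve with only
# three vertices joined by straight edges, one edge per atom of the score
# distribution.

groups = [(0.5, 0.9), (0.5, 0.3)]  # (population mass, P(y=1|x))

pos = sum(w * p for w, p in groups)  # overall positive rate (0.6)
neg = 1 - pos                        # overall negative rate (0.4)

# Sweep the threshold: accept groups in decreasing score order and record
# the (FPR, TPR) vertex reached after each atom is fully accepted.
points, tp, fp = [(0.0, 0.0)], 0.0, 0.0
for w, p in sorted(groups, key=lambda g: -g[1]):
    tp += w * p          # true positives gained from this atom
    fp += w * (1 - p)    # false positives gained from this atom
    points.append((fp / neg, tp / pos))

print([(round(f, 3), round(t, 3)) for f, t in points])
```

Accepting only the high-score atom gives the vertex (0.125, 0.75); accepting everyone gives (1, 1). Any point on the segments between these vertices is reachable only by randomizing within an atom, which is where the paper's measure-theoretic (IPS) view becomes useful.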

Theorems & Definitions (67)

  • Example 1
  • Definition 1: Cherry-Picking
  • Definition 2: Cake-cutting
  • Definition 3: Slicing
  • Definition 4: $\boldsymbol{\mu}(\boldsymbol{S})$ and $\mathop{\mathrm{IPS}}\nolimits$
  • Definition 5: Pareto pre-order
  • Definition 6
  • Theorem 1: Dvoretzky, Wald, and Wolfowitz's Theorem
  • Theorem 2
  • proof
  • ...and 57 more