Conformal Prediction Sets Can Cause Disparate Impact
Jesse C. Cresswell, Bhargava Kumar, Yi Sui, Mouloud Belbahri
TL;DR
Conformal prediction (CP) provides prediction sets with the coverage guarantee $P[y \in \mathcal{C}(x)] \ge 1-\alpha$, but deploying these sets in human-in-the-loop decisions can yield disparate impact, measured as the largest pairwise gap between groups, $\Delta_t = \max_{a,b \in \mathcal{G}} (\delta_{t,a}-\delta_{t,b})$. Through pre-registered randomized trials across three tasks, the paper shows that Equalized Coverage (Mondrian CP) often increases disparity relative to marginal CP, whereas Equalized Set Size and Equalized Singleton Frequency correlate better with reduced disparate impact. The authors analyze coverage, adoption, set size, and singleton frequency to explain why set-based fairness diverges from coverage-based fairness. The work offers practical guidance for deploying CP in real-world, human-in-the-loop settings and argues that fairness criteria should target balanced outcomes rather than coverage parity alone.
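As a minimal sketch of the disparity measure $\Delta_t$, the snippet below takes one value $\delta_{t,g}$ per group and returns the largest pairwise gap. The summary does not define $\delta_{t,g}$; here it is assumed to be some per-group outcome for task $t$ (for example, a change in human accuracy), and the function name and the numbers in the usage lines are hypothetical.

```python
def max_pairwise_gap(delta_by_group):
    """Largest difference delta_a - delta_b over all ordered pairs of groups.

    delta_by_group maps a group name to its per-group value delta_{t,g}
    (assumed here to be an outcome measure such as accuracy improvement).
    The maximum over ordered pairs equals max(values) - min(values).
    """
    values = list(delta_by_group.values())
    if not values:
        raise ValueError("need at least one group")
    return max(values) - min(values)


# Hypothetical per-group values, for illustration only.
example = {"group_a": 0.10, "group_b": 0.04, "group_c": 0.07}
gap = max_pairwise_gap(example)
```

Because the maximum runs over ordered pairs, the result is never negative and is zero only when every group has the same value.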
Abstract
Conformal prediction is a statistically rigorous method for quantifying uncertainty in models by having them output sets of predictions, with larger sets indicating more uncertainty. However, prediction sets are not inherently actionable; many applications require a single output to act on, not several. To overcome this limitation, prediction sets can be provided to a human who then makes an informed decision. In any such system it is crucial to ensure the fairness of outcomes across protected groups, and researchers have proposed that Equalized Coverage be used as the standard for fairness. By conducting experiments with human participants, we demonstrate that providing prediction sets can lead to disparate impact in decisions. Disquietingly, we find that providing sets that satisfy Equalized Coverage actually increases disparate impact compared to marginal coverage. Instead of equalizing coverage, we propose to equalize set sizes across groups, which empirically leads to lower disparate impact.
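The difference between marginal coverage and Equalized Coverage can be illustrated with a small split-conformal sketch. It assumes a classifier that outputs class probabilities and uses the common score $1 - p(y \mid x)$; the paper's actual score function, models, and data are not given in this summary, so every name below is hypothetical. Marginal CP calibrates one threshold on all calibration points, while the group-conditional (Mondrian) variant calibrates one threshold per group, which typically changes set sizes per group.

```python
import math


def conformal_threshold(scores, alpha):
    """Finite-sample-corrected (1 - alpha) quantile of calibration scores."""
    n = len(scores)
    rank = math.ceil((n + 1) * (1 - alpha))  # 1-indexed order statistic
    if rank > n:
        return math.inf  # too few calibration points: every label is included
    return sorted(scores)[rank - 1]


def prediction_set(probs, threshold):
    """Labels whose score 1 - p(label) does not exceed the threshold."""
    return [k for k, p in enumerate(probs) if 1.0 - p <= threshold]


def calibrate(cal_probs, cal_labels, cal_groups, alpha, per_group=False):
    """Return {group: threshold}.

    per_group=False: marginal CP, one shared threshold for all groups.
    per_group=True:  Mondrian CP, one threshold per group.
    """
    scores = [1.0 - p[y] for p, y in zip(cal_probs, cal_labels)]
    groups = sorted(set(cal_groups))
    if not per_group:
        shared = conformal_threshold(scores, alpha)
        return {g: shared for g in groups}
    return {
        g: conformal_threshold(
            [s for s, gg in zip(scores, cal_groups) if gg == g], alpha
        )
        for g in groups
    }


def mean_set_size_by_group(test_probs, test_groups, thresholds):
    """Average prediction-set size for each group."""
    sizes = {}
    for p, g in zip(test_probs, test_groups):
        sizes.setdefault(g, []).append(len(prediction_set(p, thresholds[g])))
    return {g: sum(v) / len(v) for g, v in sizes.items()}
```

Comparing `mean_set_size_by_group` under the two calibration modes gives a quick check of how far apart set sizes are across groups, the quantity the abstract proposes to equalize.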
