Fairness Under Demographic Scarce Regime
Patrik Joslin Kenfack, Samira Ebrahimi Kahou, Ulrich Aïvodji
TL;DR
This work tackles fairness under a demographic scarce regime, where sensitive attributes are labeled in a source set but missing in the target set, by learning proxy sensitive attributes from the labeled data. It introduces FairDSR, a two-phase framework: (i) train an uncertainty-aware attribute predictor, using a self-ensembling method based on Monte Carlo dropout, to produce proxy attributes together with their uncertainty, and (ii) enforce fairness constraints only on samples whose proxy attributes are reliable. Empirical results on five real-world datasets show that applying fairness constraints to low-uncertainty samples yields significantly better fairness-accuracy tradeoffs than classic proxy methods and can outperform models trained with the true sensitive attributes in several cases. Other uncertainty measures, including conformal prediction, corroborate these findings. The approach highlights the critical role of uncertainty in the sensitive-attribute space when designing fair models with incomplete demographic information, with practical implications for privacy-preserving and bias-aware deployments.
Abstract
Most existing works on fairness assume the model has full access to demographic information. However, there are scenarios where demographic information is only partially available, either because records were not maintained throughout data collection or because of privacy concerns. This setting is known as the demographic scarce regime. Prior research has shown that training an attribute classifier to replace the missing sensitive attributes (a proxy) can still improve fairness. However, using proxy-sensitive attributes worsens fairness-accuracy tradeoffs compared to using true sensitive attributes. To address this limitation, we propose a framework for building attribute classifiers that achieve better fairness-accuracy tradeoffs. Our method introduces uncertainty awareness in the attribute classifier and enforces fairness only on samples whose demographic information is inferred with the lowest uncertainty. We show empirically that enforcing fairness constraints on samples with uncertain sensitive attributes can negatively impact the fairness-accuracy tradeoff. Our experiments on five datasets show that the proposed framework yields models with significantly better fairness-accuracy tradeoffs than classic attribute classifiers. Surprisingly, our framework can outperform models trained with fairness constraints on the true sensitive attributes in most benchmarks. We also show that these findings are consistent with other uncertainty measures, such as conformal prediction.
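The core idea of restricting the fairness criterion to samples with reliable proxy attributes can be illustrated with a small sketch. This is not the authors' implementation. It assumes the attribute classifier has already been run several times with dropout active, that uncertainty is measured as the entropy of the averaged prediction, and that fairness is measured as a demographic parity gap. The function names, the toy data, and the threshold of 0.4 are all hypothetical.

```python
import math

def binary_entropy(p):
    """Entropy (in nats) of a Bernoulli(p) distribution."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log(p) + (1.0 - p) * math.log(1.0 - p))

def proxy_with_uncertainty(mc_probs):
    """mc_probs[i] holds P(a=1) for sample i from several stochastic
    forward passes (assumed to come from Monte Carlo dropout).
    Returns one (proxy_attribute, uncertainty) pair per sample."""
    out = []
    for passes in mc_probs:
        mean_p = sum(passes) / len(passes)
        out.append((1 if mean_p >= 0.5 else 0, binary_entropy(mean_p)))
    return out

def reliable_indices(proxies, threshold):
    """Indices of samples whose proxy uncertainty is at most `threshold`."""
    return [i for i, (_, u) in enumerate(proxies) if u <= threshold]

def demographic_parity_gap(y_pred, proxies, indices):
    """|P(y_hat=1 | a=0) - P(y_hat=1 | a=1)| over the given indices,
    using proxy attributes. Returns None if a group is empty."""
    rates = []
    for group in (0, 1):
        members = [i for i in indices if proxies[i][0] == group]
        if not members:
            return None
        rates.append(sum(y_pred[i] for i in members) / len(members))
    return abs(rates[0] - rates[1])

# Toy data (hypothetical): five samples, three stochastic passes each.
mc_probs = [
    [0.95, 0.90, 0.97],  # confidently a=1
    [0.05, 0.10, 0.02],  # confidently a=0
    [0.55, 0.45, 0.50],  # uncertain proxy attribute
    [0.92, 0.88, 0.96],  # confidently a=1
    [0.08, 0.03, 0.06],  # confidently a=0
]
y_pred = [1, 0, 1, 1, 1]  # label predictions of the downstream model

proxies = proxy_with_uncertainty(mc_probs)
keep = reliable_indices(proxies, threshold=0.4)  # hypothetical threshold
gap = demographic_parity_gap(y_pred, proxies, keep)
print(keep, gap)
```

In this toy run the third sample is excluded because its averaged prediction sits at the decision boundary, so the gap is computed over the remaining four samples only. In a training setup, a quantity like this gap would serve as the constraint or penalty term, evaluated on the retained subset rather than on all samples.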
