Federated Measurement of Demographic Disparities from Quantile Sketches

Arthur Charpentier, Agathe Fernandes Machado, Olivier Côté, François Hu

TL;DR

This work studies federated auditing of demographic parity through score distributions. It measures disparity as a Wasserstein-Fréchet variance between sensitive-group score laws and expresses the population metric in a federated form that makes explicit how silo-specific selection drives local-global mismatch.

Abstract

Many fairness goals are defined at the population level, which is misaligned with siloed data collection: the data cannot be shared because of privacy regulations. Horizontal federated learning (FL) enables collaborative modeling across clients with aligned features without sharing raw data. We study federated auditing of demographic parity through score distributions, measuring disparity as a Wasserstein-Fréchet variance between sensitive-group score laws, and expressing the population metric in a federated form that makes explicit how silo-specific selection drives local-global mismatch. For the squared Wasserstein distance, we prove an ANOVA-style decomposition that separates (i) selection-induced mixture effects from (ii) cross-silo heterogeneity, yielding tight bounds linking local and global metrics. We then propose a one-shot, communication-efficient protocol in which each silo shares only group counts and a quantile summary of its local score distributions, enabling the server to estimate global disparity and its decomposition, with $O(1/k)$ discretization bias (for $k$ quantiles) and finite-sample guarantees. Experiments on synthetic data and COMPAS show that a few dozen quantiles suffice to recover global disparity and diagnose its sources.
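The one-shot protocol described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `silo_sketch` and `server_estimate_u2`, the mid-point quantile grid, and the way the server pools silo quantiles as weighted atoms are all assumptions made for illustration. The disparity is taken to be the weighted Fréchet variance $U_2=\sum_s \alpha_s W_2^2(\nu_s,\nu^\star)$, where the barycenter $\nu^\star$ has quantile function $Q^\star=\sum_s \alpha_s Q_s$.

```python
import numpy as np

def silo_sketch(scores, groups, k):
    """Local step (hypothetical helper): per-group count and k mid-point quantiles.
    Only these summaries leave the silo, never the raw scores."""
    levels = (np.arange(k) + 0.5) / k
    sketch = {}
    for g in np.unique(groups):
        z = scores[groups == g]
        sketch[g] = (len(z), np.quantile(z, levels))
    return sketch

def server_estimate_u2(sketches, k):
    """Server step (hypothetical helper): rebuild global group quantiles from the
    silo sketches, then return sum_s alpha_s * W_2^2(nu_s, barycenter)."""
    levels = (np.arange(k) + 0.5) / k
    groups = sorted({g for sk in sketches for g in sk})
    counts, quantiles = [], []
    for g in groups:
        parts = [sk[g] for sk in sketches if g in sk]
        # Assumption: each communicated quantile is an atom carrying n / k of the
        # silo's group mass; the global group law is the pooled weighted mixture.
        atoms = np.concatenate([q for _, q in parts])
        weights = np.concatenate([np.full(len(q), n / len(q)) for n, q in parts])
        order = np.argsort(atoms)
        atoms, weights = atoms[order], weights[order]
        cdf = np.cumsum(weights) / weights.sum()
        idx = np.minimum(np.searchsorted(cdf, levels), len(atoms) - 1)
        quantiles.append(atoms[idx])
        counts.append(sum(n for n, _ in parts))
    alpha = np.array(counts) / sum(counts)   # global group proportions
    Q = np.vstack(quantiles)                 # shape (number of groups, k)
    bary = alpha @ Q                         # barycenter quantile function
    return float(np.sum(alpha[:, None] * (Q - bary) ** 2) / k)
```

Each silo would call `silo_sketch` once on its local scores and group labels, and the server would call `server_estimate_u2` on the list of received sketches. Pooling quantiles as atoms is only one simple way to recover the mixture law across silos; other reconstructions are possible.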

Paper Structure

This paper contains 82 sections, 26 theorems, 122 equations, 22 figures, 4 tables, 1 algorithm.

Key Result

Proposition 3.1

Assume that for each $s\in\mathcal{S}$, $\nu_s$ has compact support and admits a continuous density bounded away from $0$ and $\infty$ on its support. Then, for $p\in\{1,2\}$, there exist constants $C_U,C_H<\infty$ (depending only on these bounds and on $p$) such that [...] Moreover, for fixed $k$, the plug-in estimators $\widehat{U}_{k,p}$ and $\widehat{H}_{k,p}$ are consistent as $n\to\infty$, and th...

Figures (22)

  • Figure 1: Synthetic Beta. Left: score distributions by group, $U_2=0.0076$. Right: $U_2(k)$ with $k\in\{1,2\}$ as a function of $\rho$.
  • Figure 2: Sensitivity to $k$ (synthetic). Convergence of $U_{2}(k)$, independence allocation (left) and selection bias (right).
  • Figure 3: COMPAS: score distributions and Wasserstein barycenter. Left: (beta-)kernel density estimates of the jittered score $Z$ by group. Middle: empirical CDFs $F_{\mathrm{AA}}$ and $F_{\mathrm{C}}$ together with the Wasserstein barycenter distribution (dashed), obtained by inverting the barycenter quantile $Q^\star$. Right: group quantile functions $Q_{\mathrm{AA}}$ and $Q_{\mathrm{C}}$ and their Wasserstein barycenter $Q^\star = \alpha_{\mathrm{AA}}Q_{\mathrm{AA}} + \alpha_{\mathrm{C}}Q_{\mathrm{C}}$ (dashed).
  • Figure 4: COMPAS: original score distributions and Wasserstein barycenter. Left: histograms of score $Z$ by group. Middle: empirical CDFs $F_{\mathrm{AA}}$ and $F_{\mathrm{C}}$ together with the Wasserstein barycenter distribution (dashed), obtained by inverting the barycenter quantile $Q^\star$. Right: group quantile functions $Q_{\mathrm{AA}}$ and $Q_{\mathrm{C}}$ and their Wasserstein barycenter $Q^\star = \alpha_{\mathrm{AA}}Q_{\mathrm{AA}} + \alpha_{\mathrm{C}}Q_{\mathrm{C}}$ (dashed).
  • Figure 5: COMPAS: convergence in $k$. Mean absolute error $\mathrm{MAE}(k)=\mathbb{E}\,|\widehat{U}_2(k)-U_2|$ as a function of the number of quantiles $k$ (sent per group and per silo) on a log scale, for several numbers of silos $d$ and for different allocation regimes (random vs. selection bias). Across regimes, $\widehat{U}_2(k)$ stabilizes quickly, with diminishing returns beyond a few dozen quantiles.
  • ...and 17 more figures
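
The barycenter formula in the Figure 3 and Figure 4 captions leads to a short worked identity for the two-group case. The derivation below is a sketch under one assumption not stated in this listing: that $U_2$ is the $\alpha$-weighted Fréchet variance $\sum_s \alpha_s W_2^2(\nu_s,\nu^\star)$, with $\nu^\star$ the barycenter whose quantile function is $Q^\star$ and with $\alpha_{\mathrm{AA}}+\alpha_{\mathrm{C}}=1$.

```latex
\begin{aligned}
Q_{\mathrm{AA}} - Q^\star &= \alpha_{\mathrm{C}}\,(Q_{\mathrm{AA}} - Q_{\mathrm{C}}),
\qquad
Q_{\mathrm{C}} - Q^\star = -\alpha_{\mathrm{AA}}\,(Q_{\mathrm{AA}} - Q_{\mathrm{C}}),\\
U_2 &= \alpha_{\mathrm{AA}} \int_0^1 \bigl(Q_{\mathrm{AA}}(u) - Q^\star(u)\bigr)^2\,du
     + \alpha_{\mathrm{C}} \int_0^1 \bigl(Q_{\mathrm{C}}(u) - Q^\star(u)\bigr)^2\,du\\
    &= \bigl(\alpha_{\mathrm{AA}}\alpha_{\mathrm{C}}^2 + \alpha_{\mathrm{C}}\alpha_{\mathrm{AA}}^2\bigr)
       \int_0^1 \bigl(Q_{\mathrm{AA}}(u) - Q_{\mathrm{C}}(u)\bigr)^2\,du\\
    &= \alpha_{\mathrm{AA}}\,\alpha_{\mathrm{C}}\; W_2^2(\nu_{\mathrm{AA}}, \nu_{\mathrm{C}}).
\end{aligned}
```

Under that assumption, the two-group disparity is the squared Wasserstein distance between the group score laws, scaled by the product of the group proportions.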

Theorems & Definitions (56)

  • Definition 2.1: Central unfairness functional
  • Definition 2.2: Heterogeneity functional
  • Remark : Two-group simplifications
  • Remark : Interior grids and trimmed functionals
  • Proposition 3.1: Consistency and discretization rate
  • Proposition 3.2: Bin-averaged discretization underestimates $U_2$
  • Definition 4.1: Federated demographic-parity functional
  • Proposition 4.2: Consistency with centralized targets
  • Proposition 4.3: Consistency and discretization rate
  • Proposition 4.4: High-probability control of communicated quantiles
  • ...and 46 more