Efficient Federated Conformal Prediction with Group-Conditional Guarantees

Haifeng Wen, Osvaldo Simeone, Hong Xing

Abstract

Deploying trustworthy AI systems requires principled uncertainty quantification. Conformal prediction (CP) is a widely used framework for constructing prediction sets with distribution-free coverage guarantees. In many practical settings, including healthcare, finance, and mobile sensing, the calibration data required for CP are distributed across multiple clients, each with its own local data distribution. In this federated setting, data can often be organized into potentially overlapping groups, which may reflect client-specific strata or cross-cutting attributes such as demographic or semantic categories. We propose group-conditional federated conformal prediction (GC-FCP), a novel protocol that provides group-conditional coverage guarantees. GC-FCP constructs mergeable, group-stratified coresets from local calibration scores, enabling clients to communicate compact weighted summaries that support efficient aggregation and calibration at the server. Experiments on synthetic and real-world datasets validate the performance of GC-FCP against centralized calibration baselines.
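The idea of communicating compact weighted summaries that the server can merge and calibrate on can be illustrated with a minimal sketch. It assumes a deliberately simple chunk-based summary for a single group, not the coreset construction used by GC-FCP; the function names `compress`, `merge`, and `weighted_quantile` are hypothetical.

```python
def compress(scores, k):
    """Summarize scores into at most k (value, weight) pairs (hypothetical helper).

    Each chunk of sorted scores is represented by its maximum, so every score is
    replaced by a value at least as large and the resulting quantiles can only
    move upward (conservative for calibration).
    """
    s = sorted(scores)
    if not s:
        return []
    size = -(-len(s) // k)  # ceil(len(s) / k)
    return [(max(s[i:i + size]), len(s[i:i + size])) for i in range(0, len(s), size)]


def merge(summaries):
    """Merge client summaries at the server by pooling the weighted points."""
    return sorted(pair for summary in summaries for pair in summary)


def weighted_quantile(summary, level):
    """Smallest summary value whose cumulative weight reaches level * total weight."""
    total = sum(w for _, w in summary)
    acc = 0
    for value, weight in summary:
        acc += weight
        if acc >= level * total:
            return value
    return summary[-1][0]


# Two clients with toy calibration scores, each sending at most 2 weighted points.
client_a = compress([0.1, 0.2, 0.3, 0.4], k=2)
client_b = compress([0.5, 0.6], k=2)
merged = merge([client_a, client_b])
threshold = weighted_quantile(merged, 0.5)
```

In this toy run the server sees four weighted points instead of six raw scores, and the returned threshold is no smaller than the exact pooled quantile at the same level. A group-stratified variant would keep one such summary per group or stratum.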

Paper Structure (38 sections, 8 theorems, 90 equations, 5 figures, 5 tables, 1 algorithm)

Key Result

Theorem 4.1

For every group $G\in\mathcal{G}$, the prediction set in \eqref{eq:condcp_set}, built with the empirical quantile in \eqref{eq:gcfcp_augqr} produced by centralized GC-FCP, satisfies the group-conditional coverage condition \eqref{eq:group_conditional_coverage}.
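The group-conditional requirement, written in the Figure 1 caption as $\mathbb P(Y\in \mathcal{C}(X\mid\mathcal{D})\mid X\in G)\ge 1-\alpha$, can be checked empirically on held-out data. The sketch below is only an evaluation helper under assumed inputs: `covered` flags whether each test label fell inside its prediction set, and `memberships` holds one 0/1 entry per group. Both names and the toy data are hypothetical.

```python
def group_coverage(covered, memberships):
    """Empirical coverage within each group (hypothetical evaluation helper).

    covered:     list of bools, True when the label lies in the prediction set
    memberships: list of 0/1 tuples, one entry per group; groups may overlap
    """
    n_groups = len(memberships[0])
    rates = []
    for g in range(n_groups):
        hits = [c for c, m in zip(covered, memberships) if m[g]]
        rates.append(sum(hits) / len(hits) if hits else float("nan"))
    return rates


# Toy data: four test points, two overlapping groups.
covered = [True, True, False, True]
memberships = [(1, 0), (1, 1), (0, 1), (1, 1)]

marginal = sum(covered) / len(covered)
rates = group_coverage(covered, memberships)
```

Here the marginal coverage is 0.75 while the second group is covered only two times out of three, which is the kind of gap between marginal and per-group coverage that a group-conditional guarantee is meant to rule out.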

Figures (5)

  • Figure 1: Left: In a federated system with heterogeneous clients, each client $k$ holds local calibration data $\mathcal{D}_k$ and communicates a summary of it over a bandwidth-limited channel to a central server. The server aggregates these summaries to perform conformal calibration and outputs a set-valued predictor $\mathcal{C}(\cdot\mid\mathcal{D})$. Right: GC-FCP targets group-conditional coverage for potentially overlapping groups $\mathcal{G}=\{G_1,G_2,G_3\}$, ensuring the inequality $\mathbb P(Y\in \mathcal{C}(X\mid\mathcal{D})\mid X\in G)\ge 1-\alpha$ for all groups $G \in \mathcal{G}$, whereas methods that only guarantee marginal coverage $\mathbb P(Y\in \mathcal{C}(X\mid\mathcal{D}))\ge 1-\alpha$, such as FCP \cite{lu2023federated}, may still under-cover within specific groups.
  • Figure 2: Illustration of the atom partitions applied by GC-FCP. Given the set of overlapping groups $\mathcal{G}=\{G_1,G_2,G_3\}$, the resulting $7$ non-empty atoms $\mathcal{A}=\{A_1,\ldots,A_7\}$ are shown on the right, together with the corresponding group-membership vector \eqref{eq:membership_vector}.
  • Figure 3: Visualization of prediction sets for the synthetic regression task for (a) centralized CP and FedCP \cite{lu2023federated}, as well as for (b) centralized CondCP, centralized GC-FCP, and GC-FCP. (c) Per-group miscoverage rate.
  • Figure 4: Average coverage and set size versus coverage level $1-\alpha$ with vanilla CP and the proposed GC-FCP on CIFAR-10.
  • Figure 5: Average coverage and set size versus coverage level $1-\alpha$ with vanilla CP and the proposed GC-FCP on PathMNIST.
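The atom partition described in the Figure 2 caption can be sketched in a few lines. The example assumes that an atom is the set of points sharing one group-membership vector and that points belonging to no group are skipped; the toy groups (bit tests on small integers) and the names `membership_vector` and `atoms` are hypothetical.

```python
def membership_vector(x, groups):
    """Binary vector with one entry per group: 1 if x belongs to the group, else 0."""
    return tuple(int(bool(g(x))) for g in groups)


def atoms(points, groups):
    """Partition points into atoms, i.e. sets sharing one membership vector.

    Assumption: points that belong to no group (all-zero vector) are skipped.
    """
    out = {}
    for x in points:
        m = membership_vector(x, groups)
        if any(m):
            out.setdefault(m, []).append(x)
    return out


# Three overlapping toy groups on the integers 0..7, defined by bit tests.
groups = [lambda x: x & 1, lambda x: x & 2, lambda x: x & 4]
parts = atoms(range(8), groups)
```

With three overlapping groups there are $2^3-1=7$ possible nonzero membership vectors, and in this toy example each of them occurs, giving seven disjoint atoms even though the groups themselves overlap.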

Theorems & Definitions (25)

  • Theorem 4.1: Group-conditional coverage for centralized GC-FCP
  • Lemma 5.1: Uniform bound of T-Digest
  • proof
  • Theorem 5.1: Group-conditional coverage guarantees for GC-FCP
  • proof
  • Lemma B.1: CDF error controlled by maximal cluster mass
  • proof
  • Lemma B.2: Arcsine scale implies $\rho_{\max}\le \sin(\pi/\delta)$
  • proof
  • Lemma B.3: From $\|F-\widehat{F}\|_\infty$ to rank-accurate quantiles
  • ...and 15 more