Federated Computation of ROC and PR Curves

Xuefeng Xu, Graham Cormode

TL;DR

The paper addresses how to evaluate binary classifiers in federated learning without sharing raw prediction scores. It proposes a quantile-based, privacy-preserving method for approximating ROC and PR curves: score quantiles for the positive and negative classes are estimated from histograms under distributed differential privacy, and the curves are reconstructed from those quantiles by monotone interpolation. With $Q$ quantiles, the method achieves provable area-error bounds of $O(1/Q)$ for ROC and $\tilde{O}(1/Q)$ for PR (with refinements that depend on data imbalance), while communication stays linear in $Q$. Experiments on real datasets under privacy constraints show strong performance, favorable comparisons to prior range-query methods, and robust behavior across varying class distributions and score models. The method extends to multi-class settings and to other metrics, which makes it a practical, scalable option for privacy-preserving model evaluation in federated systems and supports more reliable deployment decisions in privacy-sensitive domains.

Abstract

Receiver Operating Characteristic (ROC) and Precision-Recall (PR) curves are fundamental tools for evaluating machine learning classifiers, offering detailed insights into the trade-offs between true positive rate and false positive rate (ROC) or between precision and recall (PR). However, in Federated Learning (FL) scenarios, where data is distributed across multiple clients, computing these curves is challenging due to privacy and communication constraints. Specifically, the server cannot access the raw prediction scores and class labels that are used to compute the ROC and PR curves in a centralized setting. In this paper, we propose a novel method for approximating ROC and PR curves in a federated setting by estimating quantiles of the prediction score distribution under distributed differential privacy. We provide theoretical bounds on the Area Error (AE) between the true and estimated curves, demonstrating the trade-offs between approximation accuracy, privacy, and communication cost. Empirical results on real-world datasets demonstrate that our method achieves high approximation accuracy with minimal communication and strong privacy guarantees, making it practical for privacy-preserving model evaluation in federated systems.
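
To make the quantile idea concrete, here is a minimal, non-private sketch of how an ROC curve might be rebuilt from per-class score quantiles. It is not the authors' implementation: it runs on one machine, uses exact quantiles from `np.quantile` in place of federated histogram-based estimation, adds no differential-privacy noise, and swaps in plain linear interpolation (`np.interp`) for the monotone interpolation the paper refers to. The function names, the Beta-distributed synthetic scores, and the choice of 32 quantiles are all assumptions made for illustration.

```python
import numpy as np

def quantile_summary(scores, num_q):
    """Return (levels, values): num_q + 1 evenly spaced quantiles, min and max included."""
    levels = np.linspace(0.0, 1.0, num_q + 1)
    return levels, np.quantile(scores, levels)

def ecdf_estimate(levels, values, thresholds):
    """Piecewise-linear ECDF estimate built only from a quantile summary."""
    return np.interp(thresholds, values, levels, left=0.0, right=1.0)

def approx_roc(pos_summary, neg_summary):
    """Estimate ROC points (fpr, tpr) from the two per-class quantile summaries."""
    thresholds = np.unique(np.concatenate([pos_summary[1], neg_summary[1]]))
    tpr = 1.0 - ecdf_estimate(*pos_summary, thresholds)
    fpr = 1.0 - ecdf_estimate(*neg_summary, thresholds)
    # Thresholds ascend, so both rates descend; reverse to run from (0, 0) to (1, 1).
    return fpr[::-1], tpr[::-1]

def trapezoid_area(x, y):
    """Area under a piecewise-linear curve given by points (x, y)."""
    return float(np.sum(np.diff(x) * (y[1:] + y[:-1]) / 2.0))

# Synthetic scores (assumption): positives skew high, negatives skew low.
rng = np.random.default_rng(0)
pos_scores = rng.beta(5, 2, size=1000)
neg_scores = rng.beta(2, 5, size=3000)

fpr, tpr = approx_roc(quantile_summary(pos_scores, 32),
                      quantile_summary(neg_scores, 32))
auc_estimate = trapezoid_area(fpr, tpr)
# Reference AUC from all positive/negative pairs (needs the raw scores).
auc_exact = float(np.mean(pos_scores[:, None] > neg_scores[None, :]))
print(f"AUC from quantiles: {auc_estimate:.4f}, AUC from raw scores: {auc_exact:.4f}")
```

Only the two quantile summaries enter `approx_roc`, which is why the amount of data needed to rebuild the curve grows with the number of quantiles rather than with the number of examples.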

Paper Structure

This paper contains 32 sections, 6 theorems, 34 equations, 20 figures, 1 table, 1 algorithm.

Key Result

Theorem 4.2

If $Q$ exact quantiles are used for both positive and negative examples, then the Area Error between the true and estimated ROC curves is bounded by $O(1/Q)$.
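
The bound can be probed numerically. The sketch below is an illustration under stated assumptions, not a reproduction of the paper's experiments: it takes the Area Error to be the integral over FPR in $[0, 1]$ of the absolute gap between the reference and estimated ROC curves (the summary above does not spell out Definition 4.1, so this reading is an assumption), uses linear interpolation between exact quantiles, and draws synthetic Beta-distributed scores. All function names are hypothetical.

```python
import numpy as np

def roc_reference(pos, neg, fpr_grid):
    """Reference ROC at each FPR: threshold at the (1 - fpr) quantile of negatives."""
    thresholds = np.quantile(neg, 1.0 - fpr_grid)
    pos_sorted = np.sort(pos)
    # Fraction of positives strictly above each threshold.
    return 1.0 - np.searchsorted(pos_sorted, thresholds, side="right") / pos.size

def roc_from_quantiles(pos, neg, num_q, fpr_grid):
    """ROC estimate that touches the scores only through num_q + 1 quantiles per class."""
    levels = np.linspace(0.0, 1.0, num_q + 1)
    pos_q = np.quantile(pos, levels)
    neg_q = np.quantile(neg, levels)
    # Inverse of the interpolated negative ECDF gives the threshold for each FPR.
    thresholds = np.interp(1.0 - fpr_grid, levels, neg_q)
    return 1.0 - np.interp(thresholds, pos_q, levels, left=0.0, right=1.0)

def area_error(pos, neg, num_q, grid_size=2001):
    """Grid approximation of the integral of |reference - estimate| over FPR in [0, 1]."""
    fpr_grid = np.linspace(0.0, 1.0, grid_size)
    gap = np.abs(roc_reference(pos, neg, fpr_grid)
                 - roc_from_quantiles(pos, neg, num_q, fpr_grid))
    return float(np.mean(gap))

# Synthetic scores (assumption): positives skew high, negatives skew low.
rng = np.random.default_rng(0)
pos = rng.beta(5, 2, size=2000)
neg = rng.beta(2, 5, size=6000)

errors = {q: area_error(pos, neg, q) for q in (4, 8, 16, 32, 64)}
for q, err in errors.items():
    print(f"Q={q:3d}  AE~{err:.5f}  Q*AE~{q * err:.3f}")
```

If the error behaves like $O(1/Q)$, the `Q*AE` column should stay bounded as $Q$ grows; a single synthetic run like this is only a sanity check, not evidence for the theorem.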

Figures (20)

  • Figure 1: ECDF approximation using quantiles for negative and positive classes.
  • Figure 2: Approximated ROC and PR curves constructed from ECDFs.
  • Figure 3: Interpolation method comparison (ROC, XGBoost).
  • Figure 4: Effect of $\varepsilon$ (ROC, XGBoost).
  • Figure 5: Comparison of strategies for PR curve (XGBoost).
  • ...and 15 more figures

Theorems & Definitions (12)

  • Definition 4.1: AE
  • Theorem 4.2: ROC-AE
  • Proof: Proof sketch
  • Theorem 4.3: PR-AE
  • Proof: Proof sketch
  • Theorem 4.4: AE-SA
  • Theorem 4.5: AE-DDP
  • Definition E.1: Well-behaved score distribution
  • Lemma E.2
  • Proof
  • ...and 2 more