Differentially Private Federated Learning: Servers Trustworthiness, Estimation, and Statistical Inference

Zhe Zhang; Ryumei Nakada; Linjun Zhang

TL;DR

The paper addresses high‑dimensional estimation and inference under differential privacy in federated learning, contrasting untrusted versus trusted central servers. It first proves a minimax impossibility for accurate private mean estimation when the server is untrusted, highlighting dimension‑dependent rate penalties. Under a trusted server, it develops federated estimation and inference algorithms for homogeneous and heterogeneous models, achieving near‑optimal rates and providing debiased, private confidence intervals and a private bootstrap for simultaneous inference. Simulations corroborate the theoretical results, demonstrating practical viability for privacy‑preserving, multi‑site statistical analyses such as healthcare data collaboration.

Abstract

Differentially private federated learning is crucial for maintaining privacy in distributed environments. This paper investigates the challenges of high-dimensional estimation and inference under the constraints of differential privacy. First, we study scenarios involving an untrusted central server, demonstrating the inherent difficulties of accurate estimation in high-dimensional problems. Our findings indicate that the tight minimax rates depend on the high-dimensionality of the data even with sparsity assumptions. Second, we consider a scenario with a trusted central server and introduce a novel federated estimation algorithm tailored for linear regression models. This algorithm effectively handles the slight variations among models distributed across different machines. We also propose methods for statistical inference, including coordinate-wise confidence intervals for individual parameters and strategies for simultaneous inference. Extensive simulation experiments support our theoretical advances, underscoring the efficacy and reliability of our approaches.

Paper Structure (35 sections, 21 theorems, 143 equations, 5 figures, 10 algorithms)

This paper contains 35 sections, 21 theorems, 143 equations, 5 figures, 10 algorithms.

Introduction
Overview
Related Work
Preliminaries
Differential Privacy
Federated Learning
Problem Formulation
An Impossibility Result in the Untrusted Central Server setting
Homogeneous Federated Learning Setting
Algorithms for Estimation Problems
Algorithms for Inference Problems
Theoretical Results
Heterogeneous Federated Learning Setting
Methods and Algorithms
Theoretical Results
...and 20 more sections

Key Result

Proposition 2.3

Let $f: \mathcal{X}^n \to \mathbb{R}^d$ be a deterministic algorithm with $\Delta_1(f) < \infty$. Let $\bm w \in \mathbb{R}^d$ have coordinates $w_1, w_2, \cdots, w_d$ that are i.i.d. samples drawn from Laplace$(\Delta_1(f)/\epsilon)$. Then $f(\bm X) + \bm w$ is $(\epsilon, 0)$-differentially private.
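
As an illustration of Proposition 2.3, the following is a minimal sketch of the Laplace mechanism in Python, not the paper's implementation. The function names (`laplace_noise`, `laplace_mechanism`, `clipped_mean`) are hypothetical, and the example query (a coordinate-wise mean of entries clipped to $[-B, B]$) is an assumption chosen because its $\ell_1$ sensitivity is easy to bound: if neighboring datasets differ by replacing one record, each coordinate changes by at most $2B/n$, so $\Delta_1(f) \le 2Bd/n$.

```python
import random


def laplace_noise(scale, rng):
    """Draw one Laplace(0, scale) sample.

    The difference of two i.i.d. exponential variables with rate 1/scale
    follows a Laplace distribution with location 0 and the given scale.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def laplace_mechanism(f_value, l1_sensitivity, epsilon, rng=None):
    """Add i.i.d. Laplace(l1_sensitivity / epsilon) noise to each coordinate.

    f_value        : list of floats, the exact output f(X)
    l1_sensitivity : an upper bound on Delta_1(f), assumed positive and finite
    epsilon        : privacy parameter, must be positive
    """
    if l1_sensitivity <= 0 or epsilon <= 0:
        raise ValueError("l1_sensitivity and epsilon must be positive")
    rng = rng or random.Random()
    scale = l1_sensitivity / epsilon
    return [v + laplace_noise(scale, rng) for v in f_value]


def clipped_mean(rows, bound):
    """Coordinate-wise mean after clipping every entry to [-bound, bound]."""
    n, d = len(rows), len(rows[0])
    return [
        sum(max(-bound, min(bound, row[j])) for row in rows) / n
        for j in range(d)
    ]


if __name__ == "__main__":
    rng = random.Random(0)
    bound, n, d = 1.0, 1000, 5
    # Synthetic data, used only to exercise the functions.
    data = [[rng.uniform(-1, 1) for _ in range(d)] for _ in range(n)]
    exact = clipped_mean(data, bound)
    # Replace-one-record neighbors: Delta_1 <= 2 * bound * d / n (assumption).
    sensitivity = 2 * bound * d / n
    private = laplace_mechanism(exact, sensitivity, epsilon=1.0, rng=rng)
    print(private)
```

Because the noise scale is $\Delta_1(f)/\epsilon$ per coordinate and $\Delta_1(f)$ itself grows with $d$ for this query, the total noise grows with the dimension, which gives some intuition for why high-dimensional private estimation is costly.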

Figures (5)

Figure 1: Federated Learning
Figure 2: Table for Simulation Results of the private federated linear regression
Figure 3: Confidence intervals for $\beta_k$, for coordinates $k$ randomly selected from the $800$ coordinates. The vertical axis shows the value of $\beta_k$. Red points mark the true $\beta_k$ and black points mark the estimated $\beta_k$. The results are averaged over 50 iterations.
Figure 4: Estimation results. Left: log estimation error for different numbers of samples $n$. Middle: log estimation error for different sparsity levels $s^*$. Right: log estimation error for different numbers of machines $m$.
Figure 5: Simulation results of the private simultaneous inference in different settings.

Theorems & Definitions (23)

Definition 2.1: Differential Privacy (Dwork et al., 2006)
Definition 2.2
Proposition 2.3: The Laplace Mechanism (Dwork et al., 2006; Dwork and Roth, 2014)
Proposition 2.4: The Gaussian Mechanism (Dwork et al., 2006; Dwork and Roth, 2014)
Proposition 2.5: Post-processing Property (Dwork et al., 2006)
Proposition 2.6: Composition Property (Dwork et al., 2006)
Theorem 1
Theorem 2
Theorem 3
Lemma 4.1
...and 13 more
