Private Heterogeneous Federated Learning Without a Trusted Server Revisited: Error-Optimal and Communication-Efficient Algorithms for Convex Losses

Changyu Gao, Andrew Lowy, Xingyu Zhou, Stephen J. Wright

TL;DR

The paper studies private federated learning without a trusted server under ISRL-DP, addressing two open problems: heterogeneous (non-i.i.d.) silo data and the number of communication rounds needed. It introduces a localized ISRL-DP accelerated MB-SGD framework for smooth losses that achieves the optimal excess risk in the heterogeneous setting, with communication complexity matching the non-private lower bound and improved gradient complexity. For nonsmooth losses, the authors develop smoothing-based and direct subgradient variants that preserve the optimal ISRL-DP rates and offer favorable trade-offs between communication and computation. The theory is complemented by MNIST-based experiments showing substantial practical gains over prior ISRL-DP methods, including robustness to unreliable communication. Overall, the work attains error-optimal private FL without assuming data homogeneity while improving both communication and computational efficiency; open questions remain about lower bounds and universal optimality across regimes.

Abstract

We revisit the problem of federated learning (FL) with private data from people who do not trust the server or other silos/clients. In this context, every silo (e.g. hospital) has data from several people (e.g. patients) and needs to protect the privacy of each person's data (e.g. health records), even if the server and/or other silos try to uncover this data. Inter-Silo Record-Level Differential Privacy (ISRL-DP) prevents each silo's data from being leaked, by requiring that silo i's communications satisfy item-level differential privacy. Prior work arXiv:2106.09779 characterized the optimal excess risk bounds for ISRL-DP algorithms with homogeneous (i.i.d.) silo data and convex loss functions. However, two important questions were left open: (1) Can the same excess risk bounds be achieved with heterogeneous (non-i.i.d.) silo data? (2) Can the optimal risk bounds be achieved with fewer communication rounds? In this paper, we give positive answers to both questions. We provide novel ISRL-DP FL algorithms that achieve the optimal excess risk bounds in the presence of heterogeneous silo data. Moreover, our algorithms are more communication-efficient than the prior state-of-the-art. For smooth loss functions, our algorithm achieves the optimal excess risk bound and has communication complexity that matches the non-private lower bound. Additionally, our algorithms are more computationally efficient than the previous state-of-the-art.
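
To make the ISRL-DP requirement concrete, here is a minimal sketch of one round of noisy minibatch SGD in which each silo privatizes its own message before anything leaves the silo. It is an illustration under stated assumptions, not the paper's accelerated, localized algorithm: the logistic loss, the function names (`clip_rows`, `silo_message`, `server_round`), and the parameters `clip_norm` and `sigma` are all hypothetical, and `sigma` is left as a free parameter rather than calibrated to a specific $(\varepsilon,\delta)$.

```python
import numpy as np

def clip_rows(grads, clip_norm):
    """Scale each per-record gradient so its L2 norm is at most clip_norm."""
    norms = np.linalg.norm(grads, axis=1, keepdims=True)
    scale = np.minimum(1.0, clip_norm / np.maximum(norms, 1e-12))
    return grads * scale

def silo_message(w, X, y, clip_norm, sigma, rng):
    """Message one silo sends: clipped average gradient plus Gaussian noise.

    The noise is added inside the silo, so the server and the other silos
    only ever see the privatized vector (the idea behind ISRL-DP).
    Assumption: logistic loss with labels y in {0, 1}.
    """
    p = 1.0 / (1.0 + np.exp(-X @ w))          # predicted probabilities
    per_record = (p - y)[:, None] * X         # one gradient per record
    clipped = clip_rows(per_record, clip_norm)
    noise = rng.normal(0.0, sigma, size=w.shape)
    return clipped.mean(axis=0) + noise

def server_round(w, silos, lr, clip_norm, sigma, rng):
    """Untrusted server: average the already-noisy messages and take a step."""
    msgs = [silo_message(w, X, y, clip_norm, sigma, rng) for X, y in silos]
    return w - lr * np.mean(msgs, axis=0)
```

Because the silos may hold differently distributed data, each `(X, y)` pair passed to `server_round` can come from a different distribution; the sketch makes no homogeneity assumption, but it also carries none of the paper's guarantees.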

Paper Structure (37 sections, 25 theorems, 94 equations, 3 figures, 2 tables, 7 algorithms)

Key Result

Theorem 2.1

Let $f(\cdot, x)$ be $\beta$-smooth and $M=N$. Assume $\varepsilon \leq 2 \ln(2/\delta)$ and $\delta \in (0,1)$. Then there exist parameter choices such that alg:phased_acc is $(\varepsilon,\delta)$-ISRL-DP and has the following excess risk: [display equation missing]. Moreover, the communication complexity of alg:phased_acc is [display equation missing]. Assuming $d = \Theta(n)$ and $\varepsilon = \Theta(1)$, the gradient complexity of alg:phased_acc is [display equation missing].

Figures (3)

  • Figure 1: ISRL-DP maintains the privacy of each patient's record, provided the patient's own hospital is trusted. Silo $i$'s messages are item-level DP, preventing data leakage, even if the server/other silos collude to decode the data of silo $i$.
  • Figure 2: Reliable Communication
  • Figure 3: Unreliable Communication

Theorems & Definitions (51)

  • Definition 1.1: Differential Privacy (Dwork et al., 2006)
  • Definition 1.2: Inter-Silo Record-Level Differential Privacy
  • Theorem 2.1: Upper Bound for Smooth Losses
  • Remark 2.2: Optimal risk in non-i.i.d. private FL
  • Remark 2.3: Improved communication and gradient complexity
  • Theorem 2.4: Communication Lower Bound (Woodworth et al., 2020)
  • Remark 2.5
  • Proof sketch
  • Theorem 3.1: Nonsmooth FL via Nesterov smoothing
  • Theorem 3.2: Nonsmooth FL via convolutional smoothing
  • ...and 41 more
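
For reference, the standard $(\varepsilon,\delta)$-differential privacy condition that Definition 1.1 refers to can be written as follows. The notation ($\mathcal{A}$, $X$, $X'$, $S$) is chosen here for illustration and may differ from the paper's own statement.

```latex
% A randomized algorithm \mathcal{A} is (\varepsilon,\delta)-differentially private
% if, for all pairs of data sets X, X' that differ in a single record and for all
% measurable sets S of outputs,
\[
  \Pr\bigl[\mathcal{A}(X) \in S\bigr]
  \;\le\; e^{\varepsilon}\,\Pr\bigl[\mathcal{A}(X') \in S\bigr] + \delta .
\]
```

ISRL-DP, as described in the abstract, asks that the messages sent by each silo $i$ satisfy this record-level condition with respect to silo $i$'s own data, so the guarantee holds even if the server and the other silos collude.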