Federated learning with differential privacy and an untrusted aggregator

Kunlong Liu; Trinabh Gupta

TL;DR

An evaluation of Aero demonstrates that it provides comparable accuracy to plain federated learning (without differential privacy), and it improves efficiency (CPU and network) over Orchard by up to $10^5\times$.

Abstract

Federated learning for training models over mobile devices is gaining popularity. Current systems for this task exhibit significant trade-offs between model accuracy, privacy guarantee, and device efficiency. For instance, Oort (OSDI 2021) provides excellent accuracy and efficiency but requires a trusted central server. On the other hand, Orchard (OSDI 2020) provides good accuracy and the rigorous guarantee of differential privacy over an untrusted server, but imposes huge overhead on the devices. This paper describes Aero, a new federated learning system that significantly improves this trade-off. Aero provides good accuracy and differential privacy over an untrusted server while keeping device overhead low. The key idea of Aero is to tune the system architecture and design to a specific set of popular federated learning algorithms. This tuning requires novel optimizations and techniques, e.g., a new protocol to securely aggregate updates from devices. An evaluation of Aero demonstrates that it provides accuracy comparable to plain federated learning (without differential privacy), and that it improves efficiency (CPU and network) over Orchard by up to $10^5\times$.

Paper Structure (28 sections, 1 theorem, 27 equations, 10 figures)

Introduction
Problem and background
Scenario and threat model
Goals
Possible solution approaches
Overview of Aero
DP-FedAvg without amplification
Architecture of Aero
Protocol overview of Aero
Design of Aero
Setup phase
Generate phase
Add phase
Release phase
Privacy proof
...and 13 more sections

Key Result

lemma 1

Given any function $f$, whose norm $\|f(\cdot)\|_2 \le 1$, let $z\geq 1$ be some noise scale and $\sigma=z\cdot \|f(\cdot)\|_2$, let $d=\{d_1,...,d_n\}$ be a database, let $\mathcal{J}$ be a sample from $[n]$ where each $i\in [n]$ is chosen independently with probability $q\le \frac{1}{16\sigma}$, t
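The setup of this lemma (the noise scale $\sigma = z\cdot\|f(\cdot)\|_2$, the bound $q \le \frac{1}{16\sigma}$, and the independent sampling of each index in $[n]$) can be sketched in a few lines of Python. This is only an illustration of the stated preconditions; the lemma's conclusion is cut off above and is not reproduced here. The function names `noise_scale`, `max_sampling_probability`, and `poisson_sample` are hypothetical and do not come from the paper.

```python
import random


def noise_scale(z, f_norm):
    """Return sigma = z * ||f||_2 (hypothetical helper, not from the paper)."""
    if z < 1:
        raise ValueError("the lemma assumes z >= 1")
    if f_norm > 1:
        raise ValueError("the lemma assumes ||f||_2 <= 1")
    return z * f_norm


def max_sampling_probability(sigma):
    """Largest q permitted by the lemma's condition q <= 1 / (16 * sigma)."""
    return 1.0 / (16.0 * sigma)


def poisson_sample(n, q, rng):
    """Pick each index i in [n] = {1, ..., n} independently with probability q."""
    return [i for i in range(1, n + 1) if rng.random() < q]


# Example: z = 2 and a function with norm exactly 1.
sigma = noise_scale(2.0, 1.0)
q = max_sampling_probability(sigma)
sample = poisson_sample(1000, q, random.Random(0))
```

Because every index is chosen independently, the size of the sample varies from draw to draw; its expected size is $q \cdot n$.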

Figures (10)

Figure 1: An overview of Orchard (Roth et al., OSDI 2020). $\Delta_k$ denotes the $k$-th device's update. The superscript $t$ denotes the round number. Orchard runs the four phases of setup, generate, add, and release for every round.
Figure 2: Pseudocode for the DP-FedAvg algorithm. $Clip(\cdot, S)$ scales its input vector such that its norm (Euclidean distance from the origin) is less than $S$. $\mathcal{M}$ is the privacy budget accountant of Abadi et al. (2016) that tracks the values of the DP parameters $\epsilon$ and $\delta$.
Figure 3: An overview of Aero's architecture and the four phases of its protocol.
Figure 4: Aero's verifiable aggregation. This description does not include the PIT optimization (described in text) that applies to the line of the add phase that verifies non-leaf nodes.
Figure 5: An overview of Aero's implementation.
...and 5 more figures
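
The $Clip(\cdot, S)$ operation named in the Figure 2 caption can be illustrated with a minimal Python sketch. It assumes the common clipping convention of multiplying the vector by $\min(1, S/\|v\|_2)$, which bounds the norm by $S$ (at most $S$, rather than strictly less); the paper's own pseudocode is not shown here, and the function name `clip` is hypothetical.

```python
import math


def clip(update, S):
    """Scale `update` so its Euclidean norm is at most S.

    Assumed convention: update * min(1, S / ||update||_2).
    Vectors already within the bound are returned unchanged.
    """
    norm = math.sqrt(sum(x * x for x in update))
    if norm <= S:
        return list(update)
    scale = S / norm
    return [x * scale for x in update]


# A vector of norm 5 is scaled down to norm 1; a short vector is untouched.
clipped = clip([3.0, 4.0], 1.0)
unchanged = clip([0.3, 0.4], 1.0)
```

Clipping bounds how much any single device's update can contribute, which is what allows noise to be calibrated to $S$ when the updates are aggregated.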

Theorems & Definitions (11)

lemma 1
Claim A.1
proof
Claim A.2
proof
Claim A.3
proof
Claim A.4
proof
Claim A.5
...and 1 more
