Federated Transfer Learning with Differential Privacy

Mengchu Li; Ye Tian; Yang Feng; Yi Yu

TL;DR

This work studies federated transfer learning under federated differential privacy (FDP), a privacy framework that enables learning on a target dataset by leveraging multiple heterogeneous sources under site-specific privacy constraints, without a trusted central server. It formalizes the minimax risk under FDP constraints, defines the set of informative sources, and develops adaptive procedures that select informative sources, preserving privacy while improving estimation when sources are similar to the target. The authors provide upper and lower bounds for univariate mean estimation, low-dimensional linear regression, and high-dimensional linear regression, showing that FDP sits between central DP and local DP and that knowledge transfer yields gains when heterogeneity is small and informative sources are abundant. Numerical experiments agree with the theoretical predictions, illustrating how FDP balances privacy, heterogeneity, and transfer in both homogeneous and heterogeneous settings, and highlighting practical considerations such as source dropout and adaptive inference.

Abstract

Federated learning has emerged as a powerful framework for analysing distributed data, yet two challenges remain pivotal: heterogeneity across sites and privacy of local data. In this paper, we address both challenges within a federated transfer learning framework, aiming to enhance learning on a target data set by leveraging information from multiple heterogeneous source data sets while adhering to privacy constraints. We rigorously formulate the notion of federated differential privacy, which offers privacy guarantees for each data set without assuming a trusted central server. Under this privacy model, we study three classical statistical problems: univariate mean estimation, low-dimensional linear regression, and high-dimensional linear regression. By investigating the minimax rates and quantifying the cost of privacy in each problem, we show that federated differential privacy is an intermediate privacy model between the well-established local and central models of differential privacy. Our analyses account for data heterogeneity and privacy, highlighting the fundamental costs associated with each factor and the benefits of knowledge transfer in federated learning.
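
As a rough illustration of the setting for the univariate mean problem, the sketch below has each site privatize its own sample mean before anything leaves the site, so no trusted central server is needed. It is a minimal sketch under stated assumptions, not the paper's estimator: the clipping bound `bound`, the use of the Laplace mechanism, and the function names `private_site_mean` and `aggregate` are all hypothetical choices made for this example.

```python
import random


def private_site_mean(data, epsilon, bound, rng):
    """Release an epsilon-DP mean of one site's data (assumed mechanism: clip + Laplace)."""
    n = len(data)
    # Clip each observation to [-bound, bound] so the mean has bounded sensitivity.
    clipped = [max(-bound, min(bound, x)) for x in data]
    # Replacing one record changes the clipped mean by at most 2 * bound / n.
    scale = 2.0 * bound / (n * epsilon)
    # The difference of two i.i.d. exponentials with rate 1/scale is Laplace(0, scale).
    noise = rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
    return sum(clipped) / n + noise


def aggregate(private_means, sizes):
    """Combine already-privatized site means, weighting by sample size."""
    total = sum(sizes)
    return sum(m * n for m, n in zip(private_means, sizes)) / total


if __name__ == "__main__":
    rng = random.Random(0)
    # Three hypothetical sites drawing from the same distribution (no heterogeneity).
    sites = [[rng.gauss(1.0, 1.0) for _ in range(500)] for _ in range(3)]
    releases = [private_site_mean(d, epsilon=1.0, bound=5.0, rng=rng) for d in sites]
    print(aggregate(releases, [len(d) for d in sites]))
```

Only the noisy means are communicated, which is the sense in which each site protects its own data here; the aggregation step is post-processing and adds no privacy cost.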

Paper Structure (45 sections, 24 theorems, 278 equations, 4 figures, 4 tables, 8 algorithms)

Introduction
Federated transfer learning
Privacy framework
Minimax risk under FDP constraints
Contributions
Related work
Notation
Univariate Mean Estimation
Federated private mean estimation
Optimality and minimax lower bound
Low-Dimensional Linear Regression
Federated private linear regression
Optimality and minimax lower bound
High-Dimensional Linear Regression
Federated private high-dimensional linear regression
...and 30 more sections

Key Result

Theorem 1

Given data $D_0$ and $\{D_k\}_{k \in [K]}$, with parameters from $\Theta(\mathcal{A}, h)$ defined in (eq-Theta-2-univ), suppose that $\min_{k \notin \mathcal{A}} \alpha^{(k)} \geq c f(n_0,\eta,\epsilon)$ for some sufficiently large constant $c$, with $f(\cdot, \cdot, \cdot)$ defined in (eq:private_mean_guarantee). Then, for $\tilde{\mu}$ defined in (eq:weighted_mean), with $\hat{\mathcal{A}}$ defined in (eq:mean-hat), ...

Figures (4)

Figure 1: An illustration of the privacy mechanisms that satisfy Definition 1 (FDP). For $t \in [T]$ and $k \in \{0\} \cup [K]$, $D_k^t$ and $Z_k^t$ refer to the data used in round $t$ at site $k$ and the communicated private information in round $t$ from site $k$, respectively; $B^t = (\{Z_k^t\}_{k \in \{0\} \cup [K]}, B^{t-1})$ is the set of private information from all $K+1$ sites up to round $t$. Privacy mechanisms are applied to obtain each $Z_k^t$ using the information in $D_k^t$ and $B^{t-1}$.
Figure 2: Comparison of estimation errors under different DP notions, when the sample size $n$ (left) or the privacy parameter $\epsilon$ (right) changes.
Figure 3: Performance of different methods under varying degrees of heterogeneity between target and sources.
Figure 4: An illustration of the informative source detection strategy. The blue dashed circle denotes the range of similarity levels between the target and the sources in $\mathcal{A}$. The red dash-dotted circle represents the threshold for determining the informative set $\hat{\mathcal{A}}$. Each gray dotted circle shows the range of the estimation error for $\theta^{(k)}$ when only local data from site $k$ are used. In this example, $\hat{\mathcal{A}} = \mathcal{A} = \{1,2,3\}$ and the outlier source index set is $\mathcal{A}^c = \{4, 5\}$.
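
The thresholding idea described in the Figure 4 caption can be sketched in a few lines. This is a hedged illustration only: the rule below (keep source $k$ when its estimate lies within a fixed distance of the target estimate), the size-weighted average, the function names, and all numbers are assumptions made for this example, not the paper's definitions of $\hat{\mathcal{A}}$ or $\tilde{\mu}$.

```python
def select_informative(target_est, source_ests, threshold):
    """Return 1-based indices of sources whose estimate is within `threshold` of the target's."""
    return [
        k
        for k, est in enumerate(source_ests, start=1)
        if abs(est - target_est) <= threshold
    ]


def pooled_mean(target_est, n0, source_ests, sizes, selected):
    """Size-weighted average of the target estimate and the selected source estimates."""
    total = n0 + sum(sizes[k - 1] for k in selected)
    weighted = n0 * target_est + sum(sizes[k - 1] * source_ests[k - 1] for k in selected)
    return weighted / total


if __name__ == "__main__":
    # Made-up (already privatized) estimates: sources 1-3 are close to the target, 4-5 are not.
    target_est = 0.0
    source_ests = [0.1, -0.05, 0.2, 3.0, -2.5]
    sizes = [100] * 5
    selected = select_informative(target_est, source_ests, threshold=0.5)
    print(selected)
    print(pooled_mean(target_est, 100, source_ests, sizes, selected))
```

With these made-up values the selected set is sources 1 to 3, mirroring the example in the caption where sources 4 and 5 fall outside the threshold.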

Theorems & Definitions (45)

Definition 1: Federated Differential Privacy, FDP
Remark 1
Theorem 1
Theorem 2
Theorem 3
Theorem 4
Remark 2
Theorem 5
Lemma 6
Proof: Proof of the selection-consistency lemma (lemma:selection-consistency)
...and 35 more
