Fine-Tuning Personalization in Federated Learning to Mitigate Adversarial Clients

Youssef Allouah, Abdellah El Mrini, Rachid Guerraoui, Nirupam Gupta, Rafael Pinot

TL;DR

This work analyzes the generalization performance of an interpolated personalized FL framework in the presence of adversarial clients, and precisely characterizes situations in which full collaboration performs strictly worse than fine-tuned personalization.

Abstract

Federated learning (FL) is an appealing paradigm that allows a group of machines (a.k.a. clients) to learn collectively while keeping their data local. However, due to the heterogeneity between the clients' data distributions, the model obtained through the use of FL algorithms may perform poorly on some clients' data. Personalization addresses this issue by enabling each client to have a different model tailored to its own data while simultaneously benefiting from the other clients' data. We consider an FL setting where some clients can be adversarial, and we derive conditions under which full collaboration fails. Specifically, we analyze the generalization performance of an interpolated personalized FL framework in the presence of adversarial clients, and we precisely characterize situations in which full collaboration performs strictly worse than fine-tuned personalization. Our analysis determines how much we should scale down the level of collaboration, according to data heterogeneity and the tolerable fraction of adversarial clients. We support our findings with empirical results on mean estimation and binary classification problems, considering synthetic and benchmark image classification datasets.

Paper Structure (36 sections, 9 theorems, 53 equations, 7 figures, 1 algorithm)

Key Result

Proposition 1

Consider the mean estimation setting described. For any $i \in \mathcal{C}$, let $y_i^\lambda$ be as defined in eq:mean_estimator with an aggregation rule $F$ that satisfies $(f,\kappa)$-robustness. Then the following holds true: with $\overline{\mu}_\mathcal{C} :=\frac{1}{n-f} \sum_{i\in \mathcal{C}} \mu_i$, $\Delta^2:= \frac{1}{n-f} \sum_{j \in \mathcal{C}} \left\lVert\mu_j - \overline{\mu}_\m

Figures (7)

  • Figure 1: Impact of the heterogeneity ($\sigma_h$), the number of Byzantine adversaries ($f$), and the task complexity ($\sigma$). (Top) The average error for different values of $\lambda$, computed over $20$ random experiments. (Bottom) Comparison of the theoretical $\lambda^*$ and the empirical minimizer of the error. The default values are fixed to $n=600, f=100, m=20, \sigma=15, \sigma_h=2$.
  • Figure 2: Effect of adversarial fraction, heterogeneity, and local sample size. (Top) Phishing dataset with logistic regression, with $n=20$, $\alpha=3$. (Bottom) MNIST with a convolutional neural network, with $n=20$. $\alpha = \infty$ refers to the homogeneous setting.
  • Figure 3: Effect of Byzantine fraction. Binary MNIST with logistic regression. $m=256, \alpha = 3$.
  • Figure 4: Effect of adversarial fraction and heterogeneity. Binary MNIST dataset with logistic regression. $n=200, m=32$.
  • Figure 5: Local vs. FL performance on the local test dataset. Phishing dataset with $n=20, \alpha=3$. As the number of local samples increases, the Byzantine fraction threshold above which local learning performs better than robust federated learning gets smaller.
  • ...and 2 more figures
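
The mean estimation experiment summarized in Figure 1 can be illustrated with a small simulation. The sketch below is not the authors' code and rests on several assumptions that this page does not confirm: the interpolated estimator is taken to be the convex combination $(1-\lambda)\,\cdot$ local mean $+\ \lambda \cdot F(\text{all reported means})$, a coordinate-wise trimmed mean stands in for the robust aggregation rule $F$, the data are Gaussian with $\sigma$ as the sample noise level and $\sigma_h$ as the spread of the honest clients' true means, and the adversaries all report the same far-away vector. The function names (`trimmed_mean`, `interpolated_estimate`, `simulate`) and the dimension `d` are hypothetical.

```python
import numpy as np


def trimmed_mean(vectors, f):
    """Coordinate-wise trimmed mean: per coordinate, drop the f smallest
    and f largest values, then average the rest (requires n > 2f)."""
    x = np.sort(np.asarray(vectors, dtype=float), axis=0)
    return x[f:len(x) - f].mean(axis=0)


def interpolated_estimate(local_mean, reported_means, lam, f):
    """Assumed interpolation: lam = 0 is purely local estimation,
    lam = 1 is full (robust) collaboration."""
    aggregate = trimmed_mean(reported_means, f)
    return (1.0 - lam) * np.asarray(local_mean) + lam * aggregate


def simulate(n=600, f=100, m=20, d=5, sigma=15.0, sigma_h=2.0,
             lambdas=np.linspace(0.0, 1.0, 11), seed=0):
    """Average squared error of honest clients for each value of lam."""
    rng = np.random.default_rng(seed)
    n_honest = n - f
    # Assumption: honest true means are spread around 0 with std sigma_h.
    mu = rng.normal(0.0, sigma_h, size=(n_honest, d))
    # Each honest client draws m noisy samples around its own mean.
    samples = mu[:, None, :] + rng.normal(0.0, sigma, size=(n_honest, m, d))
    local_means = samples.mean(axis=1)
    # Hypothetical attack: every adversary reports the same distant vector.
    byzantine = np.full((f, d), 100.0)
    reported = np.vstack([local_means, byzantine])
    errors = []
    for lam in lambdas:
        est = interpolated_estimate(local_means, reported, lam, f)
        errors.append(np.mean(np.sum((est - mu) ** 2, axis=1)))
    return np.array(errors)


if __name__ == "__main__":
    lambdas = np.linspace(0.0, 1.0, 11)
    errs = simulate(lambdas=lambdas)
    for lam, err in zip(lambdas, errs):
        print(f"lambda={lam:.1f}  avg squared error={err:.3f}")
    print("empirical minimizer:", lambdas[np.argmin(errs)])
```

Sweeping `sigma_h`, `f`, or `sigma` in this sketch and recording the minimizing value of `lam` gives a rough analogue of the empirical curve in the bottom row of Figure 1; the theoretical $\lambda^*$ is not reproduced here because its expression is not given on this page.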

Theorems & Definitions (16)

  • Definition 1: $(f, \kappa)$-robustness
  • Proposition 1
  • Lemma 1
  • Lemma 2
  • Theorem 1
  • Definition 2: Polyak-Łojasiewicz (PL)
  • Proposition 1
  • proof
  • Lemma 2
  • proof
  • ...and 6 more