On ADMM in Heterogeneous Federated Learning: Personalization, Robustness, and Fairness

Shengkun Zhu, Jinshan Zeng, Sheng Wang, Yuan Sun, Xiaodong Li, Yuan Yao, Zhiyong Peng

TL;DR

This paper proposes FLAME, an optimization framework that uses the alternating direction method of multipliers (ADMM) to train both personalized and global models, together with a model selection strategy that improves performance when clients hold different types of heterogeneous data.

Abstract

Statistical heterogeneity is a root cause of the tension among accuracy, fairness, and robustness in federated learning (FL), and addressing it is key to paving a path forward. Personalized FL (PFL) aims to reduce the impact of statistical heterogeneity by developing personalized models for individual users, while also inherently providing benefits in terms of fairness and robustness. However, existing PFL frameworks focus on improving the performance of personalized models while neglecting the global model. Moreover, these frameworks achieve only sublinear convergence rates and rely on strong assumptions. In this paper, we propose FLAME, an optimization framework that uses the alternating direction method of multipliers (ADMM) to train personalized and global models. We also propose a model selection strategy to improve performance when clients have different types of heterogeneous data. Our theoretical analysis establishes global convergence and two kinds of convergence rates for FLAME under mild assumptions. We theoretically demonstrate that FLAME is more robust and fair than state-of-the-art methods on a class of linear problems. Our experiments show that FLAME outperforms state-of-the-art methods in convergence and accuracy, achieves higher test accuracy under various attacks, and performs more uniformly across clients.

Paper Structure (54 sections, 19 theorems, 123 equations, 39 figures, 3 tables, 2 algorithms)

Key Result

Proposition 1

If $f$ is a proper, lower semicontinuous, and weakly convex (or nonconvex with $L$-Lipschitz $\nabla f$) function, then $F$ is $L_F$-smooth with $L_F=\lambda$ (with the condition that $\lambda>2L$ for nonconvex $L$-smooth $f$), and the gradient of $F$ is defined as
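
The statement above breaks off before the gradient formula. For orientation, a standard fact about Moreau envelopes is that, writing $F(w)=\min_\theta\{f(\theta)+\frac{\lambda}{2}\|\theta-w\|^2\}$ and $\hat\theta(w)$ for the (unique) minimizer, one has $\nabla F(w)=\lambda\,(w-\hat\theta(w))$. The symbols $w$, $\theta$, and $\hat\theta$ are assumed here and may differ from the paper's notation. The sketch below checks this identity numerically for the illustrative choice $f(\theta)=|\theta|$, which is not taken from the paper; all function names are hypothetical.

```python
def prox_abs(w, lam):
    """Minimizer of |theta| + lam/2 * (theta - w)^2: soft-thresholding at 1/lam."""
    t = 1.0 / lam
    if w > t:
        return w - t
    if w < -t:
        return w + t
    return 0.0


def moreau_env(w, lam):
    """Moreau envelope F(w) of f(theta) = |theta| with parameter lam."""
    theta = prox_abs(w, lam)
    return abs(theta) + 0.5 * lam * (theta - w) ** 2


def moreau_grad(w, lam):
    """Gradient from the envelope identity: lam * (w - prox(w))."""
    return lam * (w - prox_abs(w, lam))


def finite_diff(w, lam, h=1e-6):
    """Central finite-difference approximation of F'(w), used only as a check."""
    return (moreau_env(w + h, lam) - moreau_env(w - h, lam)) / (2.0 * h)


if __name__ == "__main__":
    lam = 2.0
    for w in (-3.0, -0.2, 0.1, 0.7, 2.5):
        print(w, moreau_grad(w, lam), finite_diff(w, lam))
```

For this choice of $f$ the envelope is the Huber function, so the gradient equals $\lambda w$ for $|w|\le 1/\lambda$ and $\pm 1$ otherwise. Its slope never exceeds $\lambda$, which matches the smoothness constant $L_F=\lambda$ in the proposition.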

Figures (39)

  • Figure 1: Four skew patterns: (a) the labels vary among different clients; (b) the features of the data differ among different clients, manifested as variations in the stroke thickness and slant angle; (c) the data quality varies, notably due to the presence of noise; (d) the quantity of data differs among clients.
  • Figure 2: An example of FLAME. Various clients may have different types of non-i.i.d. data (label and quantity skew). Selected clients download the global model from the server, and update their personalized models and local models, while unselected clients keep their model parameters unchanged. Client selection is performed on the server, and update messages $\{\boldsymbol{u}_i\}_{i=1}^m$ uploaded by clients are used to update the global model.
  • Figure 3: A comparison of the test accuracy across different methods. The dashed lines represent the global models, while the solid lines represent the personalized models. The personalized and global models of FLAME outperform other methods on four datasets.
  • Figure 8: Accuracy-fairness trade-off of competing methods (points closer to the bottom-right corner are better).
  • Figure 9: Effect of the regularization parameter $\lambda$ on the convergence of FLAME. As $\lambda$ increases, the performance of the personalized models becomes closer to that of the global models.
  • ...and 34 more figures
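
The round structure described in the Figure 2 caption (client selection, local and personalized updates on selected clients, upload of messages $\boldsymbol{u}_i$, server aggregation) can be illustrated with a generic consensus-ADMM toy problem. This is a sketch under stated assumptions, not the paper's algorithm: the scalar quadratic losses $f_i(\theta)=\frac{1}{2}(\theta-c_i)^2$, the closed-form updates, the message form $u_i = w_i + \pi_i/\rho$, the penalty parameter `rho`, and all function and key names are hypothetical choices made for illustration.

```python
import random


def admm_round(state, centers, selected, lam=1.0, rho=1.0):
    """One communication round of a generic consensus-ADMM scheme (toy sketch)."""
    # Moreau envelope of 0.5*(theta - c)^2 is 0.5*a*(w - c)^2 with a = lam/(1+lam).
    a = lam / (1.0 + lam)
    w = state["global"]  # selected clients "download" the global model
    for i in selected:
        c, pi = centers[i], state["dual"][i]
        # Local model: argmin over w_i of F_i(w_i) + pi*(w_i - w) + rho/2*(w_i - w)^2.
        w_i = (a * c - pi + rho * w) / (a + rho)
        pi = pi + rho * (w_i - w)  # dual ascent step
        state["local"][i] = w_i
        state["dual"][i] = pi
        # Personalized model: proximal point of f_i evaluated at the local model.
        state["personal"][i] = (c + lam * w_i) / (1.0 + lam)
        state["message"][i] = w_i + pi / rho  # update message u_i
    # Server: average all messages; unselected clients keep their previous u_i.
    state["global"] = sum(state["message"]) / len(state["message"])
    return state


def run(centers, rounds=200, frac=1.0, lam=1.0, rho=1.0, seed=0):
    """Run several rounds, selecting a fraction `frac` of clients each round."""
    rng = random.Random(seed)
    m = len(centers)
    state = {
        "global": 0.0,
        "local": [0.0] * m,
        "dual": [0.0] * m,
        "personal": [0.0] * m,
        "message": [0.0] * m,
    }
    k = max(1, round(frac * m))
    for _ in range(rounds):
        selected = rng.sample(range(m), k)  # selection happens on the server
        admm_round(state, centers, selected, lam, rho)
    return state


if __name__ == "__main__":
    final = run([0.0, 2.0, 4.0, 10.0])
    print(final["global"], final["personal"])
```

With full participation on this toy problem, the global model approaches the mean of the client optima $c_i$, while each personalized model settles between its own optimum and the global model, with $\lambda$ controlling how close it stays to the global one.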

Theorems & Definitions (40)

  • Definition 1: Performance fairness
  • Definition 2: Robustness
  • Definition 3: Moreau envelope [rockafellar2009variational]
  • Proposition 1
  • Definition 4: Stationary point
  • Definition 5: Graph
  • Definition 6: Semicontinuous
  • Definition 7: Real analytic function [krantz2002primer]
  • Definition 8: Semialgebraic set and function [bochnak2013real]
  • Lemma 1: Sufficient descent
  • ...and 30 more