Federated Two Stage Decoupling With Adaptive Personalization Layers

Hangyu Zhu, Yuxiang Fan, Zhenping Xie

TL;DR

FedTSDP is a two-stage decoupling federated learning algorithm with adaptive personalization layers. Client clustering is performed twice, first according to inference outputs and then according to model weights, and the method performs reliably in both IID and non-IID scenarios.

Abstract

Federated learning has gained significant attention for its ability to enable distributed learning while maintaining privacy constraints. However, because of data heterogeneity among decentralized devices, it inherently suffers from significant learning degradation and slow convergence. It is therefore natural to cluster homogeneous clients into the same group and aggregate model weights only within each group. Most existing clustered federated learning methods use either model gradients or inference outputs as the metric for client partitioning, with the goal of grouping similar devices together, yet heterogeneity may still remain within each cluster. Moreover, little research has examined how to determine the appropriate timing for clustering, which leads to the common practice of assigning each client to its own individual cluster, particularly when the data are highly non-independent and identically distributed (non-IID). In this paper, we introduce a two-stage decoupling federated learning algorithm with adaptive personalization layers named FedTSDP, where client clustering is performed twice, according to inference outputs and model weights, respectively. Hopkins amended sampling is adopted to determine the appropriate timing for clustering and the sampling weight of public unlabeled data. In addition, a simple yet effective approach is developed to adaptively adjust the personalization layers based on varying degrees of data skew. Experimental results show that our proposed method has reliable performance in both IID and non-IID scenarios.
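
The abstract refers to a Hopkins-based test for deciding when clustering is worthwhile. As a rough illustration only, the sketch below computes the textbook Hopkins statistic, not the paper's amended sampling variant, whose details are not given here. The function names, the bounding-box sampling, and the convention H = sum(u) / (sum(u) + sum(w)) are assumptions made for this example; under that convention, values near 0.5 suggest no cluster structure and values near 1 suggest strongly clustered points.

```python
import math
import random


def nearest_distance(point, data, skip_index=None):
    """Distance from `point` to its nearest neighbour in `data`.

    `skip_index` excludes one row, so a sampled data point is not
    matched against itself.
    """
    return min(
        math.dist(point, x) for k, x in enumerate(data) if k != skip_index
    )


def hopkins_statistic(data, m, seed=0):
    """Textbook Hopkins statistic (illustrative, not the paper's variant).

    data: list of equal-length numeric tuples, e.g. one feature vector
          per client (a hypothetical choice for this sketch).
    m:    number of probe points, with m <= len(data).
    Assumes the points are not all identical, otherwise the ratio is 0/0.
    """
    rng = random.Random(seed)
    dims = len(data[0])
    lows = [min(x[d] for x in data) for d in range(dims)]
    highs = [max(x[d] for x in data) for d in range(dims)]

    # w: nearest-neighbour distances for m real points sampled from the data.
    sample_idx = rng.sample(range(len(data)), m)
    w = [nearest_distance(data[i], data, skip_index=i) for i in sample_idx]

    # u: nearest-neighbour distances for m uniform points in the bounding box.
    uniform = [
        [rng.uniform(lows[d], highs[d]) for d in range(dims)]
        for _ in range(m)
    ]
    u = [nearest_distance(p, data) for p in uniform]

    return sum(u) / (sum(u) + sum(w))


# Two tight, well-separated groups of points: the statistic should be high.
clustered = [(0.01 * i, 0.0) for i in range(10)] + [
    (10.0 + 0.01 * i, 10.0) for i in range(10)
]
h = hopkins_statistic(clustered, m=5)
```

A server could, under these assumptions, postpone clustering while the statistic stays close to 0.5 and trigger it once the value rises well above that level; the actual decision rule used by FedTSDP may differ.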

Paper Structure

This paper contains 24 sections, 11 equations, 11 figures, 6 tables, 1 algorithm.

Figures (11)

  • Figure 1: A simple example of FL with personalization layers. The filled blocks of each user are shallow layers, while the dashed blocks are personalization layers.
  • Figure 2: An illustrative example of clustered FL (boxes with the same color represent a cluster). ① Perform local training on connected clients. ② Perform clustering to partition clients into different groups. ③ Perform multi-center aggregation on the server. ④ Clients receive the model of the corresponding cluster from the server.
  • Figure 3: A simple FL example demonstrating the inadequacy of using model gradients as a clustering metric, where clients with the same color possess training data with the same label classes.
  • Figure 4: An example of the similarity matrix computed among 10 clients, where the value of each element is calculated by Eq. \ref{eq:js}. The smaller the value, the more similar the two clients are.
  • Figure 5: An example of the two-stage decoupling process, where 5 client models are divided into two groups by the first stage of clustering. The second stage is then performed within each clustered group and, for example, further partitions the 3 client models in the left panel into two distinct clusters.
  • ...and 6 more figures
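
The similarity matrix in Figure 4 can be sketched as follows. The equation label `eq:js` suggests the Jensen-Shannon divergence, but the equation itself is not reproduced here, so that reading is an assumption. The function names and the idea of representing each client by a single class-probability vector (for instance, its model's averaged softmax output on public unlabeled data) are likewise hypothetical choices for this example.

```python
import math


def kl_divergence(p, q):
    """KL(p || q) in nats; terms with p_i == 0 contribute nothing."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence between two discrete distributions.

    Symmetric and bounded by ln(2); 0 means identical distributions.
    """
    mid = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, mid) + 0.5 * kl_divergence(q, mid)


def similarity_matrix(client_outputs):
    """Pairwise JS divergences; smaller entries mean more similar clients."""
    n = len(client_outputs)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = js_divergence(client_outputs[i], client_outputs[j])
            matrix[i][j] = matrix[j][i] = d
    return matrix


# Hypothetical averaged outputs of three clients over three classes.
outputs = [
    [0.8, 0.1, 0.1],
    [0.7, 0.2, 0.1],
    [0.1, 0.1, 0.8],
]
sim = similarity_matrix(outputs)
```

In this toy input, the first two clients have nearly the same output distribution, so `sim[0][1]` is much smaller than `sim[0][2]`. A matrix of this kind could then be passed to any distance-based clustering routine for the first clustering stage.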