FedSelect: Personalized Federated Learning with Customized Selection of Parameters for Fine-Tuning

Rishub Tamirisa, Chulin Xie, Wenxuan Bao, Andy Zhou, Ron Arel, Aviv Shamsian

TL;DR

FedSelect addresses data heterogeneity in federated learning by adaptively selecting which parameters each client personalizes. It borrows a gradient-based variant of the lottery ticket procedure to grow a per-client subnetwork of personalized parameters, while the remaining parameters are aggregated globally. Two hyperparameters control this process: a personalization limit $\alpha$ and a growth rate $p$. Using GradSelect, LocalAlt, and a per-index aggregation mechanism, FedSelect reports state-of-the-art personalization on CIFAR-10, CIFAR-10C, Mini-ImageNet, and OfficeHome, along with robustness to distributional shifts. The approach reduces overfitting while maintaining global performance.
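The roles of $\alpha$ and $p$ can be illustrated with a small sketch of mask growth. This is not the authors' GradSelect implementation: the function name `grow_mask`, the use of absolute parameter change as the ranking score, and the reading of $p$ as a fraction of all parameters added per call are assumptions made for illustration. The only conventions taken from the text are that a mask entry of 1 marks a personalized parameter, 0 marks a global one, and $\alpha$ caps the personalized fraction.

```python
def grow_mask(mask, delta, p, alpha):
    """Return a new 0/1 mask with more entries set to 1 (personalized).

    mask  : list of 0/1 flags, 1 = personalized, 0 = global
    delta : per-parameter change observed during local training (assumed score)
    p     : growth rate, read here as a fraction of all parameters per call
    alpha : personalization limit, the maximum fraction of 1-entries
    """
    n = len(mask)
    budget = int(alpha * n) - sum(mask)    # slots left under the limit
    k = min(int(p * n), max(budget, 0))    # how many entries to add this call
    # Rank the still-global indices by magnitude of change, largest first.
    candidates = [i for i in range(n) if mask[i] == 0]
    candidates.sort(key=lambda i: abs(delta[i]), reverse=True)
    new_mask = list(mask)
    for i in candidates[:k]:
        new_mask[i] = 1
    return new_mask


# Hypothetical 8-parameter model; the same delta is reused for brevity.
delta = [0.1, -0.9, 0.3, 0.05, 0.7, -0.2, 0.0, 0.4]
mask = [0] * 8
mask = grow_mask(mask, delta, p=0.25, alpha=0.5)
print(mask)  # [0, 1, 0, 0, 1, 0, 0, 0]
mask = grow_mask(mask, delta, p=0.25, alpha=0.5)
print(mask)  # [0, 1, 1, 0, 1, 0, 0, 1]
mask = grow_mask(mask, delta, p=0.25, alpha=0.5)
print(mask)  # unchanged: the limit alpha = 0.5 has been reached
```

With $p = 0.25$ and $\alpha = 0.5$, the mask gains two entries per call and stops growing once half of the parameters are personalized.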

Abstract

Standard federated learning approaches suffer when client data distributions are sufficiently heterogeneous. Recent methods address this client data heterogeneity via personalized federated learning (PFL), a class of FL algorithms that personalize learned global knowledge to better suit each client's local data distribution. Existing PFL methods usually decouple global updates in deep neural networks by performing personalization on particular layers (e.g., classifier heads) and global aggregation on the rest of the network. However, preselecting network layers for personalization may result in suboptimal storage of global knowledge. In this work, we propose FedSelect, a novel PFL algorithm inspired by the iterative subnetwork discovery procedure used for the Lottery Ticket Hypothesis. FedSelect incrementally expands subnetworks to personalize client parameters, concurrently conducting global aggregations on the remaining parameters. This approach enables the personalization of both client parameters and subnetwork structure during the training process. Finally, we show that FedSelect outperforms recent state-of-the-art PFL algorithms under challenging client data heterogeneity settings and demonstrates robustness to various real-world distributional shifts. Our code is available at https://github.com/lapisrocks/fedselect.
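Because each client personalizes a different subnetwork, "global aggregation on the remaining parameters" has to work index by index. The following is a minimal sketch of one way to do that, not the released implementation. The helper names `aggregate_per_index` and `broadcast`, the unweighted mean, and the choice to leave an index untouched when every client personalizes it are all assumptions. Masks use 1 for a personalized parameter and 0 for a global one.

```python
def aggregate_per_index(client_params, client_masks):
    """Average each index over the clients that treat it as global (mask 0).

    Returns None at indices that every client has personalized.
    The unweighted mean is an assumption made for this sketch.
    """
    n = len(client_params[0])
    aggregated = [None] * n
    for i in range(n):
        vals = [params[i]
                for params, mask in zip(client_params, client_masks)
                if mask[i] == 0]
        if vals:
            aggregated[i] = sum(vals) / len(vals)
    return aggregated


def broadcast(client_params, client_masks, aggregated):
    """Overwrite each client's global indices with the aggregate.

    Personalized indices (mask 1) keep the client's own value.
    """
    updated = []
    for params, mask in zip(client_params, client_masks):
        updated.append([
            params[i] if mask[i] == 1 or aggregated[i] is None else aggregated[i]
            for i in range(len(params))
        ])
    return updated


# Two hypothetical clients with three parameters each.
params = [[1.0, 2.0, 3.0], [3.0, 4.0, 5.0]]
masks = [[0, 1, 1], [0, 0, 1]]
agg = aggregate_per_index(params, masks)
print(agg)                            # [2.0, 4.0, None]
print(broadcast(params, masks, agg))  # [[2.0, 2.0, 3.0], [2.0, 4.0, 5.0]]
```

In the example, index 0 is global for both clients and is averaged, index 1 is global only for the second client, and index 2 is personalized everywhere, so no aggregate exists for it.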

Paper Structure (28 sections, 2 equations, 6 figures, 3 tables, 3 algorithms)

This paper contains 28 sections, 2 equations, 6 figures, 3 tables, 3 algorithms.

Figures (6)

  • Figure 1: Illustration of the FedSelect algorithm. An example subnetwork update for communication round $t=i$ into $t=i+1$ is depicted for $N$ clients, where 2 clients are shown. There are 4 key steps: (1) the local update / new partition via GradSelect (Algorithm \ref{alg:GradSelect}), (2) the aggregation of global parameters $v_k^t$, (3) broadcast of global parameters to the updated clients, and (4) application of the new mask $m_k^{t+1}$ as a partition for the global/personalized parameters of each client in the subsequent round $t+1$. In our algorithm, $u_k^t$ denotes the global parameters for each client at round $t$; $v_k^t$ denotes the personalized parameters for each client at round $t$; $u_k^{t^+}$ denotes the updated global parameters after GradSelect; $v_k^{t^+}$ denotes the updated personalized parameters after GradSelect; $m_k^{t+1}$ is the binary mask, with "0" denoting a global parameter and "1" denoting a personalized parameter; $\theta_g^t$ denotes the aggregate global parameters materialized each round.
  • Figure 2: Test accuracy across communication rounds of FedSelect and baselines under the experimental settings in Table \ref{table:table1}. FedSelect outperforms all baselines and exhibits more stable convergence.
  • Figure 3: Personalized performance on CIFAR-10 with different local training data sizes and shard $s=2$. FedSelect outperforms prior methods.
  • Figure 4: Normalized intersection-over-union (IoU) overlap of the subnetwork masks $m_k$ in the final round in the ResNet18 final linear layer for each client in the CIFAR-10 experiments from Table \ref{tab:table2}. Increasing $\alpha$ is shown from left to right. Each client was assigned 2 classes; the class labels are shown along the rows and columns of each matrix. Clients with similar labels develop similar subnetworks; increasing $\alpha$ results in more personalized parameters, but less distinct subnetworks.
  • Figure : FedSelect
  • ...and 1 more figure
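The mask overlap reported in Figure 4 is based on intersection-over-union between client masks. A minimal sketch of the plain (unnormalized) quantity follows. The function names `mask_iou` and `pairwise_iou` are hypothetical, the value returned for two empty masks is an arbitrary choice, and the normalization applied in the figure is not specified here, so it is not reproduced.

```python
def mask_iou(a, b):
    """Intersection-over-union of two 0/1 masks of equal length."""
    inter = sum(1 for x, y in zip(a, b) if x == 1 and y == 1)
    union = sum(1 for x, y in zip(a, b) if x == 1 or y == 1)
    # Two empty masks have no union; 0.0 is returned by convention here.
    return inter / union if union else 0.0


def pairwise_iou(masks):
    """Matrix of IoU values between every pair of client masks."""
    return [[mask_iou(a, b) for b in masks] for a in masks]


# Three hypothetical client masks over four parameters.
masks = [
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 0],
]
matrix = pairwise_iou(masks)
print(matrix[0][2])  # 1.0, identical masks
print(matrix[0][1])  # one shared index out of three in the union, about 0.333
```

Entries near 1 indicate clients whose personalized subnetworks largely coincide, which is the pattern the caption describes for clients with similar labels.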