FedSelect: Customized Selection of Parameters for Fine-Tuning during Personalized Federated Learning

Rishub Tamirisa, John Won, Chengjun Lu, Ron Arel, Andy Zhou

TL;DR

FedSelect addresses the challenge of data heterogeneity in personalized federated learning by jointly personalizing client subnetworks and weights. It introduces GradLTN, a gradient-based lottery-ticket method that identifies a subnetwork to fine-tune locally while freezing the rest for global aggregation, and LocalAlt to perform alternating updates guided by the discovered masks. The approach achieves state-of-the-art mean accuracies on CIFAR-10 in a low-client, full-participation setting, with higher personalization rates $p$ generally reducing communication while preserving global knowledge. These results suggest that fine-grained parameter-level personalization, rather than layer-wise personalization, better preserves global knowledge and adapts to local distributions. The work opens avenues for applying parameter-level subnetworks to other non-IID FL benchmarks and exploring broader datasets.
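The split between locally fine-tuned and globally aggregated parameters can be pictured with a small sketch. This is an illustration only, not the paper's GradLTN or LocalAlt procedure: the function name `aggregate_shared`, the flat-list parameter layout, and the rule of averaging each coordinate over the clients that do not personalize it are all assumptions made for this example. A mask value of 1 marks a parameter the client keeps personal, and 0 marks one it shares.

```python
from typing import List


def aggregate_shared(
    client_params: List[List[float]],
    client_masks: List[List[int]],
) -> List[List[float]]:
    """Average each coordinate over the clients that share it (mask == 0).

    Coordinates a client has marked as personalized (mask == 1) keep that
    client's local value. Hypothetical helper, for illustration only.
    """
    n_params = len(client_params[0])
    updated = [list(p) for p in client_params]  # copy so inputs stay untouched
    for j in range(n_params):
        # Clients that treat coordinate j as shared (non-personalized).
        sharers = [i for i, mask in enumerate(client_masks) if mask[j] == 0]
        if not sharers:
            continue  # every client personalizes this coordinate
        mean = sum(client_params[i][j] for i in sharers) / len(sharers)
        for i in sharers:
            updated[i][j] = mean
    return updated


# Two clients, three parameters each (toy values).
params = [[1.0, 2.0, 3.0], [3.0, 4.0, 5.0]]
masks = [[0, 1, 0], [0, 0, 1]]
result = aggregate_shared(params, masks)
print(result)  # [[2.0, 2.0, 3.0], [2.0, 4.0, 5.0]]
```

In this toy run only the first coordinate is shared by both clients, so it is the only one whose value changes. Because each client carries its own mask, the set of shared parameters can differ per client, which is the parameter-level (rather than layer-wise) granularity the summary describes.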

Abstract

Recent advancements in federated learning (FL) seek to increase client-level performance by fine-tuning client parameters on local data or personalizing architectures for the local task. Existing methods for such personalization either prune a global model or fine-tune a global model on a local client distribution. However, these existing methods either personalize at the expense of retaining important global knowledge, or predetermine network layers for fine-tuning, resulting in suboptimal storage of global knowledge within client models. Inspired by the lottery ticket hypothesis, we first introduce a hypothesis for finding optimal client subnetworks to locally fine-tune while leaving the rest of the parameters frozen. We then propose a novel FL framework, FedSelect, which uses this procedure to directly personalize both client subnetwork structure and parameters, simultaneously discovering the optimal parameters for personalization and the remaining parameters for global aggregation during training. We show that this method achieves promising results on CIFAR-10.

Paper Structure

This paper contains 17 sections, 2 equations, 2 figures, 1 table, and 2 algorithms.

Figures (2)

  • Figure 1: Intersection-over-union overlap between all pairs of client masks found by FedSelect for the final ResNet18 linear layer, for $p=0.50$. Both the $s=2$ masks (left) and $s=4$ masks (right) exhibit significant diversity.
  • Figure 2: Average test accuracies of FedSelect on non-iid client partitions of CIFAR-10 when varying the GradLTN personalization rate $p$ for $s=2$ (left) and $s=4$ (right).
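
The intersection-over-union measure named in the Figure 1 caption can be computed for binary masks as in the minimal sketch below. The helper names `mask_iou` and `pairwise_iou`, the flat 0/1 list representation, and the convention of returning 1.0 for two empty masks are assumptions for illustration; the masks are toy values, not data from the paper.

```python
from itertools import combinations
from typing import Dict, List, Tuple


def mask_iou(a: List[int], b: List[int]) -> float:
    """Intersection-over-union of two equal-length binary masks."""
    if len(a) != len(b):
        raise ValueError("masks must have the same length")
    intersection = sum(1 for x, y in zip(a, b) if x and y)
    union = sum(1 for x, y in zip(a, b) if x or y)
    # Assumed convention: two all-zero masks count as identical.
    return intersection / union if union else 1.0


def pairwise_iou(masks: List[List[int]]) -> Dict[Tuple[int, int], float]:
    """IoU for every unordered pair of client masks, keyed by (i, j) with i < j."""
    return {
        (i, j): mask_iou(masks[i], masks[j])
        for i, j in combinations(range(len(masks)), 2)
    }


# Three toy client masks over four parameters.
client_masks = [[1, 1, 0, 0], [1, 0, 1, 0], [1, 1, 0, 0]]
overlaps = pairwise_iou(client_masks)
for pair, value in sorted(overlaps.items()):
    print(pair, round(value, 3))
```

An IoU of 1.0 means two clients selected exactly the same parameters, and values near 0 mean their selections barely overlap, so low off-diagonal values correspond to the mask diversity the caption reports.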