Adaptive Model Pruning and Personalization for Federated Learning over Wireless Networks

Xiaonan Liu, Tharmalingam Ratnarajah, Mathini Sellathurai, Yonina C. Eldar

TL;DR

This work addresses the latency and accuracy challenges of federated learning (FL) over wireless networks with heterogeneous data. It proposes a framework that partitions the model into a pruned global part shared by all devices and a personalized part fine-tuned on each device, which reduces computation and communication while mitigating the effects of non-IID data. A convergence analysis gives an upper bound on the gradient norm and shows how pruning influences performance, while KKT-based optimization yields closed-form solutions for the pruning ratios and bandwidth allocations under latency constraints. Experiments on CNNs with non-IID datasets show that the proposed approach achieves learning accuracy comparable to FL with partial model personalization while cutting computation and communication latency by about 50%, which makes it attractive for resource-constrained edge deployments.

Abstract

Federated learning (FL) enables distributed learning across edge devices while protecting data privacy. However, the learning accuracy decreases due to the heterogeneity of devices' data, and the computation and communication latency increase when updating large-scale learning models on devices with limited computational capability and wireless resources. We consider an FL framework with partial model pruning and personalization to overcome these challenges. This framework splits the learning model into a global part, which is pruned and shared with all devices to learn data representations, and a personalized part, which is fine-tuned for a specific device. The framework adapts the model size during FL to reduce both computation and communication latency, and it increases the learning accuracy for devices with non-independent and identically distributed (non-IID) data. The computation and communication latency and the convergence of the proposed FL framework are mathematically analyzed. To maximize the convergence rate and guarantee learning accuracy, the Karush-Kuhn-Tucker (KKT) conditions are used to jointly optimize the pruning ratio and bandwidth allocation. Finally, experimental results demonstrate that the proposed FL framework reduces computation and communication latency by approximately 50 percent compared with FL with partial model personalization.
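The split described in the abstract can be illustrated with a minimal sketch of one aggregation round. It is not the authors' algorithm: magnitude-based pruning, averaging each global weight over the devices that kept it, and the names `magnitude_prune` and `aggregate_global` are all assumptions made for illustration. The pruning ratios are fixed by hand here, whereas the paper obtains them jointly with the bandwidth allocation from the KKT conditions.

```python
from typing import List, Tuple


def magnitude_prune(weights: List[float], ratio: float) -> Tuple[List[float], List[int]]:
    """Zero the `ratio` fraction of smallest-magnitude weights (assumed pruning rule).

    Returns the pruned weights and a 0/1 mask marking the weights that were kept.
    """
    n_prune = int(len(weights) * ratio)
    # Indices sorted from smallest to largest absolute value.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    mask = [1] * len(weights)
    for i in order[:n_prune]:
        mask[i] = 0
    return [w * m for w, m in zip(weights, mask)], mask


def aggregate_global(updates: List[List[float]], masks: List[List[int]]) -> List[float]:
    """Average each global weight over the devices that kept it (assumed rule)."""
    aggregated = []
    for j in range(len(updates[0])):
        kept = sum(m[j] for m in masks)
        total = sum(u[j] * m[j] for u, m in zip(updates, masks))
        aggregated.append(total / kept if kept else 0.0)
    return aggregated


# Two hypothetical devices: each holds a copy of the global part and its own
# personalized part. Only the pruned global part is uploaded and aggregated.
global_parts = [[0.5, -0.1, 0.3, 0.05], [0.4, 0.2, -0.02, 0.6]]
personalized_parts = [[1.0], [2.0]]  # never leave the devices
pruning_ratios = [0.5, 0.25]         # hand-picked, not optimized

pruned, masks = [], []
for weights, ratio in zip(global_parts, pruning_ratios):
    p, m = magnitude_prune(weights, ratio)
    pruned.append(p)
    masks.append(m)

new_global = aggregate_global(pruned, masks)
uploaded = [sum(m) for m in masks]  # weights each device actually transmits
```

In this toy round the first device uploads 2 of its 4 global weights and the second uploads 3, which is where the communication saving comes from; the personalized parts are untouched by aggregation.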

Paper Structure (27 sections, 78 equations, 6 figures, 1 table, 1 algorithm)

Figures (6)

  • Figure 1: FL with partial model pruning and personalization framework.
  • Figure 2: Selected personalized and global parts in CNN.
  • Figure 3: (a) The relationship between the loss value of the proposed FL and pruning ratios on the non-IID MNIST dataset. (b) The relationship between the loss value of the proposed FL and pruning ratios on the non-IID Fashion MNIST dataset. (c) The performance of the testing accuracy with increasing pruning ratio of the proposed FL.
  • Figure 4: Comparison of the testing accuracy of the proposed FL by alternatively and simultaneously local updating.
  • Figure 5: (a) Loss value comparison of joint optimization of pruning ratio and bandwidth fraction of the proposed FL framework with other three FL schemes. (b) Testing accuracy comparison of joint optimization of pruning ratio and bandwidth fraction of the proposed FL framework with other three FL schemes. (c) Comparison of communication costs on different FL schemes.
  • ...and 1 more figure

Theorems & Definitions (7)

  • 7 proofs