FedP3: Federated Personalized and Privacy-friendly Network Pruning under Model Heterogeneity

Kai Yi; Nidham Gazagnadou; Peter Richtárik; Lingjuan Lyu

TL;DR

FedP3 addresses federated learning under pronounced data and model heterogeneity by integrating both global and local pruning with layer-wise, privacy-friendly communication. It introduces per-client pruning and aggregation schemes, and a local differential privacy variant (LDP-FedP3), supported by convergence analyses showing favorable communication costs relative to unpruned baselines. The framework is validated on CIFAR-10/100, EMNIST-L, and FashionMNIST, with ResNet18 experiments highlighting scalable applicability to larger architectures. Key results include substantial reductions in communication (up to 60% in some setups) with minimal accuracy loss under non-IID distributions, and robust performance across various pruning strategies and aggregation methods. Overall, FedP3 offers a practical, theory-backed path for personalized, privacy-conscious model pruning in heterogeneous FL settings, with implications for efficient deployment in large-scale models and LLM-style architectures.
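To make the idea of layer-wise communication concrete, the following is a minimal sketch and not the authors' implementation. It assumes each client uploads only a subset of layers and that the server takes a plain average per layer over the clients that sent it; the function name `aggregate_layerwise`, the layer names, and the use of simple averaging are all illustrative assumptions.

```python
import numpy as np

def aggregate_layerwise(global_model, client_updates):
    """Average each layer over the clients that uploaded it.

    Assumption: plain averaging; layers that no client uploaded keep
    their previous global value.
    """
    new_model = {}
    for name, weights in global_model.items():
        received = [upd[name] for upd in client_updates if name in upd]
        if received:
            new_model[name] = np.mean(received, axis=0)
        else:
            new_model[name] = weights.copy()
    return new_model

# Hypothetical three-layer model and two clients with partial uploads.
global_model = {"conv1": np.zeros(2), "conv2": np.zeros(2), "fc": np.zeros(2)}
client_updates = [
    {"conv1": np.array([1.0, 1.0]), "fc": np.array([2.0, 2.0])},
    {"conv1": np.array([3.0, 3.0])},
]
new_model = aggregate_layerwise(global_model, client_updates)
```

In this toy run, `conv1` is averaged over both clients, `fc` comes from the single client that sent it, and `conv2` is left unchanged because nobody uploaded it.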

Abstract

The interest in federated learning has surged in recent research due to its unique ability to train a global model using privacy-secured information held locally on each client. This paper pays particular attention to the issue of client-side model heterogeneity, a pervasive challenge in the practical implementation of FL that escalates its complexity. Assuming a scenario where each client possesses varied memory storage, processing capabilities and network bandwidth - a phenomenon referred to as system heterogeneity - there is a pressing need to customize a unique model for each client. In response to this, we present an effective and adaptable federated framework FedP3, representing Federated Personalized and Privacy-friendly network Pruning, tailored for model heterogeneity scenarios. Our proposed methodology can incorporate and adapt well-established techniques to its specific instances. We offer a theoretical interpretation of FedP3 and its locally differential-private variant, DP-FedP3, and theoretically validate their efficiencies.
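As a rough illustration of what a locally differentially private client update can look like, the sketch below clips an update to a fixed norm and adds Gaussian noise before it leaves the client. This is the standard clip-and-noise pattern, shown under stated assumptions; it is not taken from the paper, and the function name `privatize_update` and the parameters `clip_norm` and `noise_std` are hypothetical. Calibrating `noise_std` to a target privacy budget is outside the scope of this sketch.

```python
import numpy as np

def privatize_update(update, clip_norm, noise_std, rng):
    """Clip an update to L2 norm `clip_norm`, then add Gaussian noise.

    Assumption: generic clip-and-noise step, not the paper's exact mechanism.
    """
    norm = np.linalg.norm(update)
    # Scale down only if the update exceeds the clipping threshold.
    scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    clipped = update * scale
    return clipped + rng.normal(0.0, noise_std, size=update.shape)

rng = np.random.default_rng(0)
update = np.array([3.0, 4.0])  # L2 norm is 5
noisy = privatize_update(update, clip_norm=1.0, noise_std=0.1, rng=rng)
# With zero noise the result is just the clipped update.
noiseless = privatize_update(update, clip_norm=1.0, noise_std=0.0, rng=rng)
```

Clipping bounds how much any single client can influence the aggregate, which is what lets the added noise mask individual contributions.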

Paper Structure (36 sections, 13 theorems, 68 equations, 6 figures, 6 tables)

Introduction
Approach
Algorithmic overview.
Local update.
Layer-wise aggregation.
Theoretical Analysis
Experiments
Datasets and Splitting Techniques
Optimal Layer Overlapping Among Clients
Datasets and Models Specifications.
Layer Overlapping Analysis.
Larger Network Verifications.
Key Ablation Studies
Exploring Server to Client Global Pruning Strategies
Exploring Client-Wise Local Pruning Strategies
...and 21 more sections

Key Result

Theorem 1

Let Assumption asm:smoothness hold. Given a number of iterations $K$, choose the stepsize $\gamma \leq \min\left\{ 1/L_{\max}, 1/\sqrt{\hat{L}L_{\max} K}\right\}$. Denote $\Delta_0 \coloneqq f(w^0) - f^{\inf}$. Then for any $K\geq 1$, the iterates $\{w^k\}$ of FedP3 in Algorithm alg:IST satisfy

Figures (6)

Figure 1: Pipeline illustration of our proposed framework FedP3.
Figure 2: Comparative Analysis of Layer Overlap Strategies: The left figure presents a comparative study of different overlapping layer configurations across four major datasets. On the right, we extend this comparison to include the state-of-the-art personalized FL method, FedCR. In this context, S1 refers to a class-wise non-IID distribution, while S2 indicates a Dirichlet non-IID distribution.
Figure 3: ResNet18 architecture.
Figure 4: Comparative Analysis of Server to Client Global Pruning Strategies: The left portion displays Top-1 accuracy across four major datasets and two distinct non-IID distributions, varying with different global pruning rates. On the right, we quantitatively assess the trade-off between model size and accuracy.
Figure 5: Comparison of various model aggregation strategies. $p=0.9$.
...and 1 more figure

Theorems & Definitions (16)

Definition 1: Global Pruning Sketch $\mathbf{P}$
Definition 2: Personalized Model Aggregation Sketch $\mathbf{S}$
Theorem 1: Personalized Model Aggregation
Theorem 2: LDP-FedP3
Theorem 2: Personalized Model Aggregation
Definition 3
Theorem 2: LDP-FedP3
Theorem 3: Global pruning
Lemma 1
Lemma 2
...and 6 more
