Towards Privacy-Preserving and Heterogeneity-aware Split Federated Learning via Probabilistic Masking

Xingchen Wang, Feijie Wu, Chenglin Miao, Tianchun Li, Haoyu Hu, Qiming Cao, Jing Gao, Lu Su

TL;DR

PM-SFL tackles privacy in split federated learning by introducing probabilistic masking that injects structured randomness without explicit noise. It addresses data heterogeneity with personalized masks and system heterogeneity with layer-wise knowledge compensation, enabling adaptive splitting. The authors provide theoretical privacy analyses, including a data reconstruction lower bound and DP amplification, and demonstrate empirical gains in accuracy, communication efficiency, and robustness against privacy attacks across image and wireless sensing tasks.

Abstract

Split Federated Learning (SFL) has emerged as an efficient alternative to traditional Federated Learning (FL) by reducing client-side computation through model partitioning. However, exchanging intermediate activations and model updates introduces significant privacy risks, especially from data reconstruction attacks that recover original inputs from intermediate representations. Existing defenses based on noise injection often degrade model performance. To overcome these challenges, we present PM-SFL, a scalable and privacy-preserving SFL framework that incorporates Probabilistic Mask training to add structured randomness without relying on explicit noise. This mitigates data reconstruction risks while maintaining model utility. To address data heterogeneity, PM-SFL employs personalized mask learning that tailors submodel structures to each client's local data. For system heterogeneity, we introduce a layer-wise knowledge compensation mechanism, enabling clients with varying resources to participate effectively under adaptive model splitting. Theoretical analysis confirms the framework's privacy protection, and experiments on image and wireless sensing tasks demonstrate that PM-SFL consistently improves accuracy, communication efficiency, and robustness to privacy attacks, with particularly strong performance under data and system heterogeneity.
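To make the idea of "structured randomness without explicit noise" concrete, the following is a minimal sketch of a probabilistically masked client-side layer. It is not the authors' implementation: the sigmoid parameterization of mask probabilities, the frozen random weights, the single linear layer with ReLU, and the names `sample_mask` and `masked_forward` are all assumptions made for illustration.

```python
import numpy as np

def sigmoid(s):
    return 1.0 / (1.0 + np.exp(-s))

def sample_mask(scores, rng):
    """Draw a binary mask with P(mask = 1) = sigmoid(score), elementwise."""
    probs = sigmoid(scores)
    return (rng.random(scores.shape) < probs).astype(scores.dtype)

def masked_forward(x, weights, scores, rng):
    """Hypothetical client-side pass: one masked linear layer followed by ReLU.

    The randomness comes from re-sampling the mask, not from additive noise.
    """
    mask = sample_mask(scores, rng)
    smashed = np.maximum(x @ (weights * mask), 0.0)
    return smashed, mask

rng = np.random.default_rng(0)
weights = rng.standard_normal((8, 4))  # assumed: fixed random weights
scores = np.zeros((8, 4))              # assumed: learnable scores, sigmoid(0) = 0.5
x = rng.standard_normal((2, 8))        # toy batch of two inputs

# Two passes over the same input use independently sampled masks,
# so the smashed data sent to the server generally differs between passes.
smashed_a, mask_a = masked_forward(x, weights, scores, rng)
smashed_b, mask_b = masked_forward(x, weights, scores, rng)
```

In this sketch, a personalized mask would correspond to each client holding its own `scores` array, so that different clients select different submodels from the same underlying weights.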

Paper Structure

This paper contains 31 sections, 6 theorems, 59 equations, 7 figures, 3 tables, 1 algorithm.

Key Result

Theorem 1

Suppose there is a deep neural network with parameters $\mathbf{w}_b \in \mathbb{R}^{n \times m}$, i.e., $f_{\mathbf{w}_b}: \mathbb{R}^n \rightarrow \mathbb{R}^m$, where $\mathbf{w}_b$ is a non-singular matrix. The raw input $x \in \mathbb{R}^n$ with a randomly generated mask $M$ yields a smash [...]. $\lambda(\cdot)$ denotes the singular value of a matrix. $\hat{M}$ is a randomly generated mask w[...]

Figures (7)

  • Figure 1: Illustration of typical FL and SFL.
  • Figure 2: Visualization of reconstructed data for vanilla SFL (SplitFed) and noise-injection-based variants (PixelDP) under different privacy budgets on CIFAR-100 using ResNet-18.
  • Figure 3: Probabilistic Mask Training in SFL.
  • Figure 4: Performance of Probabilistic Mask Training on CIFAR-100 using ResNet-18.
  • Figure 5: Privacy–Accuracy Tradeoff Under Reconstruction Attacks on CIFAR-100 Using ResNet-18.
  • ...and 2 more figures

Theorems & Definitions (9)

  • Theorem 1
  • Definition 1: Adjacent Datasets
  • Definition 2: $(\epsilon, \delta)$-DP
  • Definition 3: $(\epsilon, \delta)$-DP
  • Theorem 2: Privacy Guarantee for Smashed Data Forwarding
  • Theorem 3: Privacy Guarantee for Mask Aggregation
  • Lemma 1 (Dwork and Roth, 2014)
  • Lemma 2
  • Lemma 3
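
For readers unfamiliar with the $(\epsilon, \delta)$-DP notion listed above, the standard textbook definition is reproduced here. The paper's own statement may differ in notation or in how adjacency is defined; the mechanism is written $\mathcal{A}$ here (an assumed symbol) to avoid a clash with the mask $M$.

```latex
% Standard (\epsilon, \delta)-differential privacy.
% \mathcal{A}: randomized mechanism; D, D': any pair of adjacent datasets;
% S: any measurable subset of the output space of \mathcal{A}.
\[
  \Pr\bigl[\mathcal{A}(D) \in S\bigr]
  \;\le\;
  e^{\epsilon} \, \Pr\bigl[\mathcal{A}(D') \in S\bigr] + \delta
\]
```

Smaller $\epsilon$ and $\delta$ mean the output distribution changes less when a single record changes, which is the sense in which the privacy guarantees in Theorems 2 and 3 are stated.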