P4: Towards private, personalized, and Peer-to-Peer learning

Mohammad Mahdi Maheri, Sandra Siby, Sina Abdollahi, Anastasia Borovykh, Hamed Haddadi

TL;DR

This work tackles data heterogeneity and privacy in decentralized learning by introducing P4, a private, personalized, peer-to-peer framework. P4 forms private groups based on model-weight similarity and conducts DP-enabled co-training within groups using a private and a proxy model, coupled with knowledge distillation to preserve utility under noise. It demonstrates up to 40% accuracy gains over state-of-the-art DP-P2P methods across FEMNIST and CIFAR datasets, with robust results on resource-constrained devices such as Raspberry Pi. The approach offers practical, low-overhead personalization in fully decentralized settings and lays groundwork for extending privacy-preserving P2P learning to broader domains and more robust defenses against adversarial participants.

Abstract

Personalized learning is a proposed approach to address the problem of data heterogeneity in collaborative machine learning. In a decentralized setting, the two main challenges of personalization are client clustering and data privacy. In this paper, we address these challenges by developing P4 (Personalized Private Peer-to-Peer), a method that ensures that each client receives a personalized model while maintaining a differential privacy guarantee for each client's local dataset during and after training. Our approach includes a lightweight algorithm that identifies similar clients and groups them in a private, peer-to-peer (P2P) manner. Once clients are grouped, we develop differentially private knowledge distillation that lets them co-train with minimal impact on accuracy. We evaluate our proposed method on three benchmark datasets (FEMNIST, i.e., Federated EMNIST, CIFAR-10, and CIFAR-100) and two neural network architectures (linear and CNN-based networks) across a range of privacy parameters. The results demonstrate the potential of P4: it outperforms state-of-the-art differentially private P2P methods by up to 40 percent in accuracy. We also show the practicality of P4 by implementing it on resource-constrained devices and validating that it has minimal overhead, e.g., about 7 seconds to run collaborative training between two clients.
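To make the grouping step more concrete, here is a minimal sketch of similarity-based grouping over privatized weights. It is not the authors' algorithm: the use of cosine similarity, the greedy grouping rule, and the names `clip_and_noise`, `cosine_similarity`, and `form_groups` are all assumptions made for illustration, and the noise scale is not calibrated to any particular privacy budget.

```python
import math
import random


def clip_and_noise(weights, clip_norm, noise_std, rng):
    """Clip a weight vector to L2 norm <= clip_norm, then add Gaussian noise.

    Assumption: this mimics a Gaussian-mechanism style release; noise_std is
    NOT derived from a privacy budget here.
    """
    norm = math.sqrt(sum(w * w for w in weights))
    scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    return [w * scale + rng.gauss(0.0, noise_std) for w in weights]


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na > 0 and nb > 0 else 0.0


def form_groups(shared, group_size):
    """Greedy grouping (hypothetical rule): take the next unassigned client
    and group it with its most similar unassigned peers."""
    unassigned = sorted(shared)
    groups = []
    while unassigned:
        seed = unassigned.pop(0)
        ranked = sorted(
            unassigned,
            key=lambda c: cosine_similarity(shared[seed], shared[c]),
            reverse=True,
        )
        members = [seed] + ranked[: group_size - 1]
        for m in members[1:]:
            unassigned.remove(m)
        groups.append(members)
    return groups


# Toy example: two clients per underlying data distribution.
rng = random.Random(0)
local_weights = {
    "a": [1.0, 0.0],
    "b": [0.9, 0.1],
    "c": [0.0, 1.0],
    "d": [0.1, 0.9],
}
shared = {c: clip_and_noise(w, 1.0, 0.05, rng) for c, w in local_weights.items()}
groups = form_groups(shared, group_size=2)
```

In this toy setup only the clipped, noised vectors in `shared` are compared, so raw local weights never leave a client. Clients whose weights point in similar directions end up in the same group.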

Paper Structure (18 sections, 12 equations, 8 figures, 1 table)

Figures (8)

  • Figure 1: Visualization of Model Aggregation in Group $g$: Clients employ proxy and private models, aggregating only the proxy model. In each group, updates are exchanged with one client, serving as an aggregator, which can change during training to distribute communication overhead. After receiving the aggregated model, clients perform local training for personalization.
  • Figure 2: Test Accuracy for Linear model architecture on FEMNIST with $\epsilon=15$ (a) $N=2$ (b) $N=4$ (c) $N=8$
  • Figure 3: Test Accuracy for Linear model architecture on CIFAR-100 with $\epsilon=15$ (a) $N=2$ (b) $N=4$ (c) $N=8$
  • Figure 4: Test Accuracy for CNN model architecture on CIFAR-10 with $\epsilon=15$ (a) $\gamma=25\%$ (b) $\gamma=50\%$ (c) $\gamma=75\%$
  • Figure 5: Test Accuracy for Linear model architecture on CIFAR-10 with $\epsilon=15$ (a) $\gamma=25\%$ (b) $\gamma=50\%$ (c) $\gamma=75\%$
  • ...and 3 more figures
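The aggregation flow described in the Figure 1 caption can be sketched as follows. This is an illustrative sketch only: the round-robin rotation of the aggregator, plain unweighted averaging, and the names `pick_aggregator`, `average_proxies`, and `run_round` are assumptions. The caption states only that the aggregator can change during training and that only proxy models are aggregated.

```python
def pick_aggregator(members, round_idx):
    """Rotate the aggregator role round-robin (assumed policy) to spread
    communication overhead across the group."""
    return members[round_idx % len(members)]


def average_proxies(proxies):
    """Element-wise mean of the proxy weight vectors sent to the aggregator."""
    vectors = list(proxies.values())
    n = len(vectors)
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / n for i in range(dim)]


def run_round(members, proxies, round_idx):
    """One aggregation round inside a group.

    Only proxy models are exchanged; private models stay on each client and
    do not appear in this function at all.
    """
    aggregator = pick_aggregator(members, round_idx)
    averaged = average_proxies({m: proxies[m] for m in members})
    # Every member receives its own copy of the aggregated proxy model and
    # would then continue with local training for personalization.
    return aggregator, {m: list(averaged) for m in members}


members = ["a", "b", "c"]
proxies = {"a": [1.0, 2.0], "b": [3.0, 4.0], "c": [5.0, 6.0]}
aggregator, updated = run_round(members, proxies, round_idx=0)
```

After `run_round`, each client would resume local training from its copy of the averaged proxy model, which is the personalization step the caption mentions and which is omitted here.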