Private Knowledge Sharing in Distributed Learning: A Survey

Yasas Supeksala, Dinh C. Nguyen, Ming Ding, Thilina Ranbaduge, Calson Chua, Jun Zhang, Jun Li, H. Vincent Poor

TL;DR

This survey addresses private knowledge sharing in distributed learning by mapping the knowledge components shared across supervised, unsupervised, semi-supervised, and deep reinforcement learning architectures. It provides a taxonomy of components (e.g., gradients, logits, latent representations, and policy signals), analyzes vulnerability surfaces and attacker models, and reviews defenses including differential privacy, homomorphic encryption, federated learning, secure multi-party computation (MPC), and privacy-aware architectural designs. The paper highlights attacks such as gradient leakage, model inversion, data poisoning, and policy manipulation, and discusses defenses and their impact on utility and practicality. It further identifies limitations of current schemes, including privacy-utility trade-offs and communication overhead, and suggests future directions for robust, privacy-preserving distributed AI systems across diverse learning paradigms.

Abstract

The rise of Artificial Intelligence (AI) has revolutionized numerous industries and transformed the way society operates. Its widespread use has led to the distribution of AI and its underlying data across many intelligent systems. In this light, it is crucial to utilize information in learning processes that are either distributed or owned by different entities. As a result, modern data-driven services have been developed to integrate distributed knowledge entities into their outcomes. In line with this goal, the latest AI models are frequently trained in a decentralized manner. Distributed learning involves multiple entities working together to make collective predictions and decisions. However, this collaboration can also bring about security vulnerabilities and challenges. This paper provides an in-depth survey on private knowledge sharing in distributed learning, examining various knowledge components utilized in leading distributed learning architectures. Our analysis sheds light on the most critical vulnerabilities that may arise when using these components in a distributed setting. We further identify and examine defensive strategies for preserving the privacy of these knowledge components and preventing malicious parties from manipulating or accessing the knowledge information. Finally, we highlight several key limitations of knowledge sharing in distributed learning and explore potential avenues for future research.

Paper Structure (41 sections, 7 figures, 2 tables)

Figures (7)

  • Figure 1: Taxonomy of Distributed Learning Models: The numbered items in each box represent the knowledge components we discuss throughout the paper.
  • Figure 2: Distributed learning: part "a" illustrates distributed learning achieved through data parallelism, while parts "b" to "d" demonstrate various architectures built on the fundamental idea of model parallelism.
  • Figure 3: Knowledge sharing in supervised learning.
  • Figure 4: Knowledge sharing in unsupervised learning.
  • Figure 5: The learning architecture of distributed learning through distributed discriminators.
  • ...and 2 more figures