Private Knowledge Sharing in Distributed Learning: A Survey
Yasas Supeksala, Dinh C. Nguyen, Ming Ding, Thilina Ranbaduge, Calson Chua, Jun Zhang, Jun Li, H. Vincent Poor
TL;DR
This survey addresses private knowledge sharing in distributed learning by mapping the knowledge components shared across supervised, unsupervised, semi-supervised, and deep reinforcement learning architectures. It provides a taxonomy of these components (e.g., gradients, logits, latent representations, and policy signals), analyzes vulnerability surfaces and attacker models, and reviews defenses including differential privacy, homomorphic encryption, secure multi-party computation (MPC), and privacy-aware architectural designs. The paper highlights attacks such as gradient leakage, model inversion, data poisoning, and policy manipulation, and discusses how each defense affects utility and practicality. It also identifies limitations of current schemes, including privacy-utility trade-offs and communication overhead, and suggests future directions for robust, privacy-preserving distributed AI systems across diverse learning paradigms.
Abstract
The rise of Artificial Intelligence (AI) has revolutionized numerous industries and transformed the way society operates. Its widespread use has led to the distribution of AI and its underlying data across many intelligent systems. It is therefore crucial for learning processes to make use of information that is distributed across, or owned by, different entities. As a result, modern data-driven services have been developed to integrate distributed knowledge entities into their outcomes, and the latest AI models are frequently trained in a decentralized manner. Distributed learning involves multiple entities working together to make collective predictions and decisions. However, this collaboration can also introduce security vulnerabilities and challenges. This paper provides an in-depth survey on private knowledge sharing in distributed learning, examining the various knowledge components used in leading distributed learning architectures. Our analysis sheds light on the most critical vulnerabilities that may arise when these components are used in a distributed setting. We further identify and examine defensive strategies for preserving the privacy of these knowledge components and for preventing malicious parties from manipulating or accessing the shared knowledge. Finally, we highlight several key limitations of knowledge sharing in distributed learning and explore potential avenues for future research.
