Communication-Efficient Personalized Distributed Learning with Data and Node Heterogeneity
Zhuojun Tian, Zhaoyang Zhang, Yiwei Li, Mehdi Bennis
TL;DR
The paper addresses data and node heterogeneity in decentralized artificial intelligence of things (AIoT) systems by introducing the Distributed Strong Lottery Ticket Hypothesis (DSLTH) and a communication-efficient personalized learning framework built on it. Each local model is represented as $\mathbf{v}_i = \mathbf{w} \odot \mathbf{m}_i$, where $\mathbf{w}$ is a fixed global parameter tensor and $\mathbf{m}_i$ is a personalized binary mask; group sparsity regularization promotes structured sparsity in the resulting model. A novel aggregation mechanism uses an intermediate aggregation tensor and a personalized fine-tuning step (MCE-PL) to fuse neighbor information while preserving personalization across heterogeneous devices. The paper also provides a theoretical proof of the DSLTH under non-i.i.d. conditions. Empirical results on CIFAR-10 show improved convergence and personalization together with reduced communication cost across diverse topologies and heterogeneity settings.
Abstract
To jointly tackle the challenges of data and node heterogeneity in decentralized learning, we propose a distributed strong lottery ticket hypothesis (DSLTH), based on which a communication-efficient personalized learning algorithm is developed. In the proposed method, each local model is represented as the Hadamard product of global real-valued parameters and a personalized binary mask for pruning. The local model is learned by updating and fusing the personalized binary masks, while the real-valued parameters are kept fixed and shared across agents. To further reduce the complexity of hardware implementation, we incorporate a group sparse regularization term in the loss function, enabling the learned local model to achieve structured sparsity. Then, a binary mask aggregation algorithm is designed by introducing an intermediate aggregation tensor and adding a personalized fine-tuning step in each iteration, which steers model updates toward the local data distribution. The proposed method effectively leverages the correlation among agents while meeting personalized requirements under heterogeneous node conditions. We also provide a theoretical proof of the DSLTH, establishing it as the foundation of the proposed method. Numerical simulations confirm the validity of the DSLTH and demonstrate the effectiveness of the proposed algorithm.
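The masked representation described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the tensor shapes, the example masks, the helper names `local_model` and `group_sparsity_penalty`, and the choice of treating each row as one group (a standard group-lasso style penalty) are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Shared real-valued parameters w: fixed and identical on every agent.
# Shape (4, 3) is an arbitrary toy layer with 4 output units and 3 inputs.
w = rng.standard_normal((4, 3))

# Hypothetical personalized binary masks for two agents (illustrative values).
m_1 = np.array([[1, 1, 0],
                [0, 0, 0],
                [1, 0, 1],
                [0, 0, 0]])
m_2 = np.array([[0, 0, 0],
                [1, 1, 1],
                [1, 0, 0],
                [0, 1, 1]])


def local_model(w, m):
    """Local model v_i = w ⊙ m_i (element-wise Hadamard product)."""
    return w * m


def group_sparsity_penalty(v):
    """Sum of per-row L2 norms; each row is assumed to be one group.

    Penalizing whole rows pushes entire output units to zero, which is
    one common way to obtain structured rather than scattered sparsity.
    """
    return float(np.sum(np.linalg.norm(v, axis=1)))


v_1 = local_model(w, m_1)
v_2 = local_model(w, m_2)

# Rows that are entirely zero correspond to units pruned as a whole.
pruned_rows_1 = np.flatnonzero(~v_1.any(axis=1))
pruned_rows_2 = np.flatnonzero(~v_2.any(axis=1))

penalty_1 = group_sparsity_penalty(v_1)
penalty_2 = group_sparsity_penalty(v_2)
```

Both agents share the same `w` but end up with different effective models because their masks differ. In this sketch only the masks vary between agents, which mirrors the abstract's statement that the real-valued parameters stay fixed while the binary masks are updated and fused.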
