On the effects of similarity metrics in decentralized deep learning under distributional shift
Edvin Listo Zec, Tom Hagander, Eric Ihre-Thomason, Sarunas Girdzijauskas
TL;DR
This work addresses model aggregation in decentralized learning under distributional shift. It evaluates four similarity metrics for identifying beneficial peers for model merging and introduces FedSim, a similarity-weighted aggregation rule. Experiments across concept, covariate, domain, and label shifts on synthetic and real datasets show that cosine similarity on weights or gradients often yields robust peer selection, and that FedSim can outperform traditional FedAvg, especially under strong cluster divergence. The authors also find that inverse empirical loss can be a noisy similarity signal and that $L^2$ distance is generally weaker, while pre-training and clustering dynamics can complicate similarity signals. The findings offer practical guidance for designing robust, privacy-preserving decentralized learning systems and motivate future work on theory, privacy-preserving similarity measures, and scalable aggregation under non-iid data.
Abstract
Decentralized Learning (DL) enables privacy-preserving collaboration among organizations or users to enhance the performance of local deep learning models. However, model aggregation becomes challenging when client data is heterogeneous, and identifying compatible collaborators without direct data exchange remains a pressing issue. In this paper, we investigate the effectiveness of various similarity metrics in DL for identifying peers for model merging, conducting an empirical analysis across multiple datasets with distribution shifts. Our results characterize the strengths and limitations of these metrics and their role in facilitating effective collaboration, contributing to the development of robust DL methods.
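
To make the idea of similarity-based peer selection more concrete, the following is a minimal sketch and not the authors' implementation. It assumes each client's model is available as a flattened weight vector, and it shows two of the kinds of metrics discussed (cosine similarity and an $L^2$-based score) together with one possible similarity-weighted merge. The function names, the softmax weighting, and the `temperature` parameter are illustrative assumptions; the exact FedSim rule is not specified in this summary and may differ.

```python
import numpy as np


def cosine_similarity(u, v, eps=1e-12):
    """Cosine similarity between two flattened parameter (or gradient) vectors."""
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    # eps guards against division by zero for all-zero vectors
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v) + eps))


def l2_similarity(u, v):
    """Negative Euclidean distance, so that larger values mean 'more similar'."""
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    return -float(np.linalg.norm(u - v))


def similarity_weighted_average(local, peers, temperature=1.0):
    """Hypothetical similarity-weighted merge (an assumption, not the paper's rule).

    Each model (the local one included) is weighted by a softmax over its
    cosine similarity to the local model.
    """
    models = [np.asarray(local, dtype=float)]
    models += [np.asarray(p, dtype=float) for p in peers]
    sims = np.array([cosine_similarity(models[0], m) for m in models])
    logits = sims / temperature
    logits = logits - logits.max()  # shift for numerical stability
    weights = np.exp(logits)
    weights = weights / weights.sum()
    merged = sum(w * m for w, m in zip(weights, models))
    return merged, weights


if __name__ == "__main__":
    local_model = [1.0, 0.0]
    peer_models = [[1.0, 0.0], [-1.0, 0.0]]  # one aligned peer, one opposed peer
    merged, weights = similarity_weighted_average(local_model, peer_models)
    print("weights:", weights)
    print("merged:", merged)
```

In this toy setup the aligned peer receives a larger weight than the opposed one, which is the qualitative behavior a similarity-weighted rule is meant to provide; uniform weights would instead correspond to plain averaging in the style of FedAvg.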
