Mosaic Learning: A Framework for Decentralized Learning with Model Fragmentation
Sayan Biswas, Davide Frey, Romaric Gaudel, Nirupam Gupta, Anne-Marie Kermarrec, Dimitri Lerévérend, Rafael Pires, Rishi Sharma, François Taïani, Martijn de Vos
TL;DR
Mosaic Learning addresses decentralized training efficiency by partitioning each local model into $K$ fragments and disseminating them through fragment-specific gossip, keeping total communication cost unchanged. The authors show that its worst-case convergence rate matches that of the state-of-the-art Epidemic Learning (EL) baseline and that, in convex settings, a larger number of fragments improves consensus by reducing the contraction factor $\rho(M_t^{\top}M_t)$. Empirically, Mosaic Learning yields gains of up to 12 percentage points in node-level accuracy under highly heterogeneous data while matching EL performance in IID scenarios, supporting fragmentation as a first-class primitive for DL. The work thus offers both theoretical guarantees and practical improvements, suggesting that fragmentation can enhance scalability and robustness without extra communication cost.
Abstract
Decentralized learning (DL) enables collaborative machine learning (ML) without a central server, making it suitable for settings where training data cannot be centrally hosted. We introduce Mosaic Learning, a DL framework that decomposes models into fragments and disseminates them independently across the network. Fragmentation reduces redundant communication across correlated parameters and enables more diverse information propagation without increasing communication cost. We theoretically show that Mosaic Learning (i) achieves a state-of-the-art worst-case convergence rate, and (ii) leverages parameter correlation in an ML model, improving contraction by reducing the largest eigenvalue of a simplified system. We empirically evaluate Mosaic Learning on four learning tasks and observe up to 12 percentage points higher node-level test accuracy compared to epidemic learning (EL), a state-of-the-art baseline. In summary, Mosaic Learning improves DL performance without sacrificing its utility or efficiency, and positions itself as a new DL standard.
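
To make the fragmentation idea concrete, the following is a minimal NumPy sketch of a single averaging round, not the authors' implementation. It assumes an EL-style push gossip in which every node sends to `s` uniformly random peers and each receiver averages its own values with whatever it received; the local training step is omitted. The function names `gossip_matrix` and `mosaic_round`, the contiguous split into fragments, and the parameter `s` are illustrative assumptions. Each of the `K` fragments gets its own independently sampled mixing matrix, so a node sends `K` fragments of size roughly `d / K` to `s` peers each, which amounts to about the same volume as sending the full model to `s` peers.

```python
import numpy as np


def gossip_matrix(n, s, rng):
    """Hypothetical helper: row-stochastic mixing matrix for one push-gossip round.

    Each node sends to s distinct random peers; every node then averages
    its own value with the values it received.
    """
    A = np.eye(n)  # every node keeps its own value
    for sender in range(n):
        others = [j for j in range(n) if j != sender]
        peers = rng.choice(others, size=s, replace=False)
        A[peers, sender] = 1.0  # row = receiver, column = sender
    return A / A.sum(axis=1, keepdims=True)


def mosaic_round(X, K, s, rng):
    """One averaging round over an (n, d) array X of node parameters.

    The d coordinates are split into K contiguous fragments (an assumption
    made for simplicity), and each fragment is mixed with its own
    independently sampled gossip matrix.
    """
    n, d = X.shape
    out = np.empty_like(X)
    for cols in np.array_split(np.arange(d), K):
        W = gossip_matrix(n, s, rng)
        out[:, cols] = W @ X[:, cols]
    return out


rng = np.random.default_rng(0)
X = rng.normal(size=(8, 12))           # 8 nodes, 12 parameters each
Y = mosaic_round(X, K=4, s=2, rng=rng)  # 4 fragments of 3 parameters
```

With `K=1` the sketch reduces to a single mixing matrix applied to the whole model, which corresponds to the unfragmented baseline under the same assumptions. Because every mixing matrix is row-stochastic, each node's new value for a coordinate is a convex combination of the old values, so the per-coordinate spread across nodes cannot grow in a round.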
