Bayesian Federated Model Compression for Communication and Computation Efficiency
Chengyu Xia, Danny H. K. Tsang, Vincent K. N. Lau
TL;DR
This work tackles the twin problems of communication and computation efficiency in federated learning by introducing a Bayesian approach with a hierarchical clustered sparsity prior. It develops a decentralized Turbo-VBI framework (D-Turbo-VBI) that combines sum-product message passing (SPMP) on an HMM prior with mean-field variational inference to jointly infer sparse weights across clients while promoting a common sparse structure. The authors prove convergence to a stationary point under standard assumptions and demonstrate significant gains in communication reduction and local inference cost on CIFAR-10/100 benchmarks. The approach supports cluster-wise transmission and efficient computation through tiled, cluster-based operations, which makes it suitable for scalable deployment in distributed settings.
Abstract
In this paper, we investigate Bayesian model compression in federated learning (FL) to construct sparse models that achieve both communication and computation efficiency. We propose a decentralized Turbo variational Bayesian inference (D-Turbo-VBI) FL framework, in which we first propose a hierarchical sparse prior that promotes a clustered sparse structure in the weight matrix. Then, by carefully integrating message passing and VBI within a decentralized turbo framework, we develop the D-Turbo-VBI algorithm, which can (i) reduce both upstream and downstream communication overhead during federated training and (ii) reduce the computational complexity of local inference. Additionally, we establish the convergence property of the proposed D-Turbo-VBI algorithm. Simulation results show that the proposed algorithm achieves a significant gain over the baselines in reducing communication overhead during federated training and the computational complexity of the final model.
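
To give some intuition for why a clustered sparse structure can lower communication cost, the following is a minimal sketch and not the D-Turbo-VBI algorithm itself. It assumes the weight matrix has already been sparsified so that zeros fall into square tiles, and it sends one mask bit per tile plus the values of the non-zero tiles only. The function names `encode_clustered` and `decode_clustered`, the tile size, and the mask-plus-payload format are all hypothetical choices made for illustration; the paper's actual transmission scheme may differ.

```python
def encode_clustered(weights, block):
    """Split a 2-D weight matrix (list of lists) into block x block tiles.

    Returns a per-tile mask (row-major) and the values of non-zero tiles only.
    Hypothetical format for illustration; assumes dimensions divide evenly.
    """
    rows, cols = len(weights), len(weights[0])
    assert rows % block == 0 and cols % block == 0
    mask, payload = [], []
    for r in range(0, rows, block):
        for c in range(0, cols, block):
            tile = [weights[i][j]
                    for i in range(r, r + block)
                    for j in range(c, c + block)]
            active = any(v != 0.0 for v in tile)
            mask.append(active)
            if active:
                payload.extend(tile)  # only non-zero tiles are transmitted
    return mask, payload


def decode_clustered(mask, payload, rows, cols, block):
    """Rebuild the dense matrix from the tile mask and the transmitted values."""
    weights = [[0.0] * cols for _ in range(rows)]
    values = iter(payload)
    k = 0
    for r in range(0, rows, block):
        for c in range(0, cols, block):
            if mask[k]:
                for i in range(r, r + block):
                    for j in range(c, c + block):
                        weights[i][j] = next(values)
            k += 1
    return weights


# A 4 x 4 matrix whose non-zero weights are clustered in the top-left tile.
W = [
    [0.5, -0.2, 0.0, 0.0],
    [0.1,  0.3, 0.0, 0.0],
    [0.0,  0.0, 0.0, 0.0],
    [0.0,  0.0, 0.0, 0.0],
]
mask, payload = encode_clustered(W, block=2)
restored = decode_clustered(mask, payload, rows=4, cols=4, block=2)
```

In this toy case, 4 mask bits and 4 values are sent instead of 16 values. Because the surviving weights sit in contiguous tiles, a receiver could also skip the all-zero tiles during matrix multiplication, which is the intuition behind the computation savings claimed for local inference.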
