Communication-Efficient Federated Low-Rank Update Algorithm and its Connection to Implicit Regularization
Haemin Park, Diego Klabjan
TL;DR
This work tackles two challenges in federated learning, communication cost and performance degradation, by revealing a rank discrepancy between client and server losses and exploiting low-rank updates. The authors establish that client Hessians exhibit a higher stable rank than the server Hessian, and that low-rank gradient updates align better across clients, which motivates a low-rank-constrained optimization. They propose FedLoRU, a general framework that factorizes client updates into low-rank matrices $\bm{A}$ and $\bm{B}$, aggregates them at the server, and periodically accumulates the updates to build a higher-rank global model, with a convergence rate matching FedAvg. Empirically, FedLoRU achieves competitive accuracy with substantially reduced communication and scales better as the number of clients grows, including in large-scale FL and LLM fine-tuning scenarios. The work also introduces variants for personalization and model heterogeneity.
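The stable rank mentioned above has a standard definition: the squared Frobenius norm divided by the squared spectral norm. The following is a minimal sketch of that definition only; the function name `stable_rank` and the toy matrices are illustrative assumptions and are not taken from the paper.

```python
import numpy as np

def stable_rank(matrix):
    """Stable rank: squared Frobenius norm over squared spectral norm."""
    fro_sq = np.linalg.norm(matrix, "fro") ** 2
    spec_sq = np.linalg.norm(matrix, 2) ** 2  # largest singular value, squared
    return fro_sq / spec_sq

# Toy matrices (assumptions for illustration, not Hessians from the paper).
identity = np.eye(4)                                    # all singular values equal
rank_one = np.outer(np.arange(1.0, 5.0), np.ones(4))    # a single nonzero singular value

sr_identity = stable_rank(identity)
sr_rank_one = stable_rank(rank_one)
```

The stable rank never exceeds the ordinary rank, and it changes smoothly under small perturbations, which makes it a convenient quantity for comparing spectra.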
Abstract
Federated Learning (FL) faces significant challenges in communication efficiency and suffers performance degradation when scaling to many clients. To address these issues, we explore the potential of low-rank updates and provide the first theoretical study of rank properties in FL. Our theoretical analysis shows that a client's loss exhibits a higher-rank structure (i.e., gradients span higher-rank subspaces of the Hessian) than the server's loss, and that low-rank approximations of the clients' gradients are more similar to one another. Based on this insight, we hypothesize that constraining client-side optimization to a low-rank subspace could provide an implicit regularization effect while reducing communication costs. Consequently, we propose FedLoRU, a general low-rank update framework for FL. Our framework enforces low-rank client-side updates and accumulates these updates to form a higher-rank model. We establish convergence of the algorithm at a rate matching that of FedAvg. Additionally, variants of FedLoRU can adapt to environments with statistical and model heterogeneity by employing multiple or hierarchical low-rank updates. Experimental results demonstrate that FedLoRU performs comparably to full-rank algorithms and is robust to heterogeneous clients and to large numbers of clients.
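To make the idea of low-rank client updates with periodic accumulation concrete, here is a minimal NumPy sketch on a toy linear regression problem. It is an illustration under stated assumptions, not the authors' implementation: the toy data, the names `local_update` and `init_factors`, the LoRA-style initialization (small random first factor, zero second factor), the separate averaging of the two factors, and all hyperparameters are hypothetical choices.

```python
import numpy as np

rng = np.random.default_rng(0)
d, k, r = 6, 3, 2                  # weight shape (d, k); update rank r (assumed)
num_clients, n = 4, 40             # toy federation size and samples per client

# Toy data: every client observes Y = X @ W_true (an assumption for illustration).
W_true = rng.normal(size=(d, k))
clients = []
for _ in range(num_clients):
    X = rng.normal(size=(n, d))
    clients.append((X, X @ W_true))

def global_loss(W):
    """Average over clients of the per-client mean squared-error loss."""
    return float(np.mean([0.5 * np.sum((X @ W - Y) ** 2) / len(X) for X, Y in clients]))

def init_factors():
    """Hypothetical init: small random A, zero B, so A @ B starts at zero."""
    return 0.1 * rng.normal(size=(d, r)), np.zeros((r, k))

def local_update(W, A, B, X, Y, lr=0.02, steps=10):
    """Client step: W stays frozen, only the low-rank factors A and B are trained."""
    A, B = A.copy(), B.copy()
    for _ in range(steps):
        residual = X @ (W + A @ B) - Y
        grad_W = X.T @ residual / len(X)               # gradient w.r.t. the effective weight
        grad_A, grad_B = grad_W @ B.T, A.T @ grad_W    # chain rule through A @ B
        A -= lr * grad_A
        B -= lr * grad_B
    return A, B

W = np.zeros((d, k))
A, B = init_factors()
initial_loss = global_loss(W)

for rnd in range(1, 31):
    # Clients send back only A and B: r * (d + k) numbers instead of d * k.
    results = [local_update(W, A, B, X, Y) for X, Y in clients]
    A = np.mean([a for a, _ in results], axis=0)       # server averages each factor
    B = np.mean([b for _, b in results], axis=0)
    if rnd % 10 == 0:
        W = W + A @ B                                  # accumulate into the global model
        A, B = init_factors()                          # start a fresh low-rank update

final_loss = global_loss(W)
```

Each accumulated product `A @ B` has rank at most `r`, but the sum of several such products can exceed rank `r`, which is how repeated low-rank updates can yield a higher-rank model. The communication saving comes from transmitting `r * (d + k)` values per round rather than `d * k`; in this toy setting the dimensions are too small for that to matter, so the sketch only illustrates the mechanics.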
