Communication-Efficient Federated Low-Rank Update Algorithm and its Connection to Implicit Regularization

Haemin Park, Diego Klabjan

TL;DR

This work tackles the dual challenges of communication efficiency and performance in federated learning by revealing a rank discrepancy between client and server losses and exploiting low-rank updates. The authors establish that client Hessians exhibit a higher stable rank than the server Hessian, and that low-rank gradient updates align better across clients, motivating a low-rank-constrained optimization. They propose FedLoRU, a general framework that factorizes client updates into low-rank matrices $\bm{A}$ and $\bm{B}$, aggregates them at the server, and periodically accumulates updates to build a higher-rank global model, with convergence matching FedAvg. Empirically, FedLoRU achieves competitive accuracy with substantially reduced communication and demonstrates superior scalability as the number of clients grows, including effectiveness in large-scale FL and LLM fine-tuning scenarios. The work also introduces practical variants for personalization and model heterogeneity, underscoring the practical impact of low-rank updates for robust, scalable federated learning.
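The stable rank mentioned above can be illustrated with a small sketch. It assumes the standard definition, the squared Frobenius norm divided by the squared spectral norm, which may differ in detail from the estimator used in the paper; the function name `stable_rank` and the test matrices are hypothetical.

```python
import numpy as np

def stable_rank(matrix):
    """Stable rank ||M||_F^2 / ||M||_2^2 (standard definition, assumed here)."""
    # Singular values are returned in descending order, so index 0 is the spectral norm.
    singular_values = np.linalg.svd(matrix, compute_uv=False)
    return float(np.sum(singular_values ** 2) / singular_values[0] ** 2)

rng = np.random.default_rng(0)
# A rank-one matrix: all of its energy lies in a single direction.
rank_one = np.outer(rng.standard_normal(5), rng.standard_normal(5))

print(stable_rank(np.eye(5)))  # all singular values equal
print(stable_rank(rank_one))   # one dominant singular value
```

The 5-by-5 identity has stable rank 5 and a rank-one matrix has stable rank 1 (up to floating-point error), so a larger value indicates that the spectrum is spread over more directions.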

Abstract

Federated Learning (FL) faces significant challenges related to communication efficiency and performance reduction when scaling to many clients. To address these issues, we explore the potential of using low-rank updates and provide the first theoretical study of rank properties in FL. Our theoretical analysis shows that a client's loss exhibits a higher-rank structure (i.e., gradients span higher-rank subspaces of the Hessian) compared to the server's loss, and that low-rank approximations of the clients' gradients have greater similarity. Based on this insight, we hypothesize that constraining client-side optimization to a low-rank subspace could provide an implicit regularization effect while reducing communication costs. Consequently, we propose FedLoRU, a general low-rank update framework for FL. Our framework enforces low-rank client-side updates and accumulates these updates to form a higher-rank model. We are able to establish convergence of the algorithm; the convergence rate matches FedAvg. Additionally, variants of FedLoRU can adapt to environments with statistical and model heterogeneity by employing multiple or hierarchical low-rank updates. Experimental results demonstrate that FedLoRU performs comparably to full-rank algorithms and exhibits robustness to heterogeneous and large numbers of clients.
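The update scheme described in the abstract can be sketched on a toy problem. This is a minimal illustration under stated assumptions, not the authors' implementation: the least-squares objective, the uniform averaging of the factors, the LoRA-style initialization (random $\bm{A}$, zero $\bm{B}$), and all names (`local_update`, `fedloru_sketch`, `accumulate_every`) are hypothetical choices made for the example.

```python
import numpy as np

def loss(W, clients):
    """Average least-squares loss of weight matrix W over all clients."""
    return float(np.mean([0.5 * np.mean(np.sum((X @ W - Y) ** 2, axis=1))
                          for X, Y in clients]))

def init_factors(rng, d_in, d_out, rank):
    # Assumed initialization: random A, zero B, so A @ B starts at zero.
    return 0.3 * rng.standard_normal((d_in, rank)), np.zeros((rank, d_out))

def local_update(W, A, B, X, Y, lr, steps):
    """Client step: gradient descent on the factors A and B only; W stays frozen."""
    A, B = A.copy(), B.copy()
    for _ in range(steps):
        grad_M = X.T @ (X @ (W + A @ B) - Y) / X.shape[0]  # gradient w.r.t. W + A @ B
        grad_A, grad_B = grad_M @ B.T, A.T @ grad_M        # chain rule through A @ B
        A -= lr * grad_A
        B -= lr * grad_B
    return A, B

def fedloru_sketch(clients, d_in, d_out, rank, rounds, accumulate_every,
                   lr=0.03, local_steps=5, seed=0):
    rng = np.random.default_rng(seed)
    W = np.zeros((d_in, d_out))
    A, B = init_factors(rng, d_in, d_out, rank)
    for t in range(1, rounds + 1):
        results = [local_update(W, A, B, X, Y, lr, local_steps) for X, Y in clients]
        # Server aggregation: average the low-rank factors (uniform weights assumed).
        A = np.mean([a for a, _ in results], axis=0)
        B = np.mean([b for _, b in results], axis=0)
        if t % accumulate_every == 0:
            # Fold the low-rank update into W and restart with fresh factors.
            W = W + A @ B
            A, B = init_factors(rng, d_in, d_out, rank)
    return W + A @ B

rng = np.random.default_rng(0)
d_in, d_out, rank = 8, 6, 2
W_true = 0.5 * rng.standard_normal((d_in, d_out))
clients = []
for _ in range(4):
    X = rng.standard_normal((50, d_in))
    clients.append((X, X @ W_true))

W_final = fedloru_sketch(clients, d_in, d_out, rank, rounds=40, accumulate_every=10)
initial_loss = loss(np.zeros((d_in, d_out)), clients)
final_loss = loss(W_final, clients)
print(initial_loss, final_loss, np.linalg.matrix_rank(W_final))
```

In this toy setting each client transmits the two factors, $r(d_{\text{in}} + d_{\text{out}}) = 2 \cdot (8 + 6) = 28$ numbers per round, instead of the $8 \cdot 6 = 48$ entries of a full update, while the repeated accumulation lets the global matrix exceed rank 2.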

Paper Structure (57 sections, 13 theorems, 109 equations, 7 figures, 7 tables, 4 algorithms)


Key Result

Proposition 3.1

Let $\bm{H}_N^R$ be defined as in (eqn:additive_perturbed_model). If $\lambda_i(\bm{H}_N^R)$ denotes the $i$-th eigenvalue of $\bm{H}_N^R$, then for $i=1, \cdots, p$, the following holds: [...] as $R \to \infty$, and for $i=0, \cdots, q-1$, we have [...]. Here, $g^{-1}_N(\theta) = \theta + \frac{\sigma^2 s_N^2}{\theta}$, $U_N = 2\sigma s_N$, and $L_N = -2\sigma s_N$. In addition, for $p < i \leq P-q$, we have [...]
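
A short worked step relates the three quantities defined in the proposition. Assuming $\sigma s_N > 0$ (an assumption, since the source does not state the sign conventions here), the AM-GM inequality gives, for any $\theta > 0$,

$$
g^{-1}_N(\theta) = \theta + \frac{\sigma^2 s_N^2}{\theta} \;\geq\; 2\sqrt{\theta \cdot \frac{\sigma^2 s_N^2}{\theta}} = 2\sigma s_N = U_N,
$$

with equality exactly when $\theta = \sigma s_N$. By symmetry, $g^{-1}_N(\theta) \leq -2\sigma s_N = L_N$ for any $\theta < 0$. So $g^{-1}_N$ never takes a value strictly inside the interval $(L_N, U_N)$, which suggests reading $L_N$ and $U_N$ as the endpoints of the interval that separates the two groups of eigenvalues indexed in the statement.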

Figures (7)

  • Figure 1: The estimated stable ranks of the Hessians are compared for dataset sizes of 50 and 500 (averaged over multiple runs). For details of the experiment, see Appendix (appx: detail_esr).
  • Figure 2: The relative difference in test accuracy between two algorithms, plotted against the number of clients. The relative difference of $\text{Alg}_1$ to $\text{Alg}_2$ is defined as $\frac{\text{Alg}_1 - \text{Alg}_2}{\text{Alg}_1}$.
  • Figure 3: Communication cost of low-rank FL methods to reach target accuracy (X: not reached).
  • Figure 5: The test accuracy curves for FMNIST under an IID setting with K=20 and K=100.
  • Figure 6: The test accuracy curves for CIFAR-10 under an IID setting with K=20 and K=100.
  • ...and 2 more figures
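
The relative difference defined in the Figure 2 caption can be written as a one-line function. The function name and the accuracy values below are hypothetical and serve only to illustrate the formula.

```python
def relative_difference(acc_alg1, acc_alg2):
    """Relative difference of Alg_1 to Alg_2: (Alg_1 - Alg_2) / Alg_1."""
    return (acc_alg1 - acc_alg2) / acc_alg1

# Hypothetical test accuracies, not taken from the paper.
print(relative_difference(0.80, 0.76))
print(relative_difference(0.76, 0.80))
```

Because the denominator is the accuracy of $\text{Alg}_1$, the measure is not symmetric: swapping the two arguments changes the magnitude as well as the sign. With the values above, the first call gives about 0.05 and the second about -0.053.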

Theorems & Definitions (23)

  • Proposition 3.1: Limiting eigenvalues of $\bm{H}_N^R$ (modified from baskerville2022universal)
  • Theorem 3.2
  • Theorem 3.3
  • Theorem 4.2: Convergence of FedLoRU
  • Lemma A.1: Theorem 2.2 from pielaszkiewicz2015closed
  • Lemma A.2: cf. capitaine2013additive
  • Lemma A.3: Weyl's inequality
  • Proposition A.4
  • proof
  • proof
  • ...and 13 more