Preventing Rank Collapse in Federated Low-Rank Adaptation with Client Heterogeneity

Fei Wu, Jia Hu, Geyong Min, Shiqiang Wang

Abstract

Federated low-rank adaptation (FedLoRA) has facilitated communication-efficient and privacy-preserving fine-tuning of foundation models for downstream tasks. In practical federated learning scenarios, client heterogeneity in system resources and data distributions motivates heterogeneous LoRA ranks across clients. We identify a previously overlooked phenomenon in heterogeneous FedLoRA, termed rank collapse, where the energy of the global update concentrates on the minimum shared rank, resulting in suboptimal performance and high sensitivity to rank configurations. Through theoretical analysis, we reveal the root cause of rank collapse: a mismatch between rank-agnostic aggregation weights and rank-dependent client contributions, which systematically suppresses higher-rank updates at a geometric rate over rounds. Motivated by this insight, we propose raFLoRA, a rank-partitioned aggregation method that decomposes local updates into rank partitions and then aggregates each partition weighted by its effective client contributions. Extensive experiments across classification and reasoning tasks show that raFLoRA prevents rank collapse, improves model performance, and preserves communication efficiency compared to state-of-the-art FedLoRA baselines.
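The mismatch described in the abstract can be illustrated with a toy NumPy sketch. It is not the authors' implementation: the function names `zero_pad_average` and `rank_partitioned_average` are hypothetical, clients are weighted uniformly, and each client is assumed to send a single LoRA factor of shape `(r_k, n)` whose first `r_k` rows align with the first `r_k` rank directions of the global factor. Under these assumptions, averaging zero-padded factors over all clients shrinks the higher-rank rows, while averaging each row only over the clients that actually hold it does not.

```python
import numpy as np


def zero_pad_average(factors, r_max):
    """Rank-agnostic baseline: zero-pad every factor to r_max rows, then
    average over ALL clients, including those that never trained a row."""
    n_cols = factors[0].shape[1]
    padded = np.zeros((len(factors), r_max, n_cols))
    for k, f in enumerate(factors):
        padded[k, : f.shape[0]] = f
    return padded.mean(axis=0)


def rank_partitioned_average(factors, r_max):
    """Sketch of rank-partitioned aggregation (uniform weights assumed):
    each row is averaged only over the clients whose rank covers it."""
    n_cols = factors[0].shape[1]
    total = np.zeros((r_max, n_cols))
    count = np.zeros(r_max)
    for f in factors:
        r = f.shape[0]
        total[:r] += f
        count[:r] += 1
    count = np.maximum(count, 1)  # rows held by no client stay zero
    return total / count[:, None]


# Three clients with ranks 2, 4 and 8; every trained row is all ones,
# so any value below 1 in the aggregate is pure dilution.
factors = [np.ones((r, 3)) for r in (2, 4, 8)]
print(zero_pad_average(factors, 8)[:, 0])
print(rank_partitioned_average(factors, 8)[:, 0])
```

In this toy setting the baseline keeps rows 0 and 1 at 1 but scales rows 2 to 3 by 2/3 and rows 4 to 7 by 1/3, whereas the partitioned average keeps every row at 1. Counting clients per row is equivalent to splitting the rows into partitions between consecutive client ranks when the weights are uniform; other weightings, such as by local data size, would need the counts replaced by summed weights.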

Paper Structure

This paper contains 38 sections, 1 theorem, 44 equations, 9 figures, 9 tables, 1 algorithm.

Key Result

Theorem 4.4

Let $\rho_{r_1}^{(t)} = \frac{\sum_{i=1}^{r_1} e_i^{(t)}}{\sum_{j=1}^{r_{\max}} e_j^{(t)}}$ denote the cumulative expected energy ratio of the global update within the shared rank $r_1$ at round $t$. Then the effective rank of the global update collapses to $r_1$ at a geometric rate. Specifically, [...] where the initial energy imbalance constant $C$ and the convergence rate $\gamma$ are given by [...]
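As a small illustration of the ratio $\rho_{r_1}^{(t)}$ defined in the theorem, the sketch below computes it for a single round from a vector of per-direction energies. The helper names are hypothetical, and `rank_one_energies` assumes that the energy $e_i$ of direction $i$ is the squared Frobenius norm of the rank-one term $b_i a_i^\top$ of a LoRA update $BA$; the excerpt does not state how $e_i^{(t)}$ is defined, so that choice is only one plausible reading.

```python
import numpy as np


def rank_one_energies(B, A):
    """Assumed energy of each rank direction i: ||b_i a_i^T||_F^2, which
    equals ||b_i||^2 * ||a_i||^2 for column b_i of B and row a_i of A."""
    return (np.linalg.norm(B, axis=0) ** 2) * (np.linalg.norm(A, axis=1) ** 2)


def cumulative_energy_ratio(energies, r1):
    """Share of the total energy carried by the first r1 rank directions."""
    energies = np.asarray(energies, dtype=float)
    total = energies.sum()
    if total == 0:
        raise ValueError("total energy is zero; the ratio is undefined")
    return energies[:r1].sum() / total


# Four rank directions, of which the first two (the shared rank) dominate.
energies = [4.0, 4.0, 1.0, 1.0]
print(cumulative_energy_ratio(energies, r1=2))  # 8 / 10 = 0.8
```

A ratio close to 1 means that almost all of the energy sits in the first $r_1$ directions, which is the condition the theorem associates with rank collapse.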

Figures (9)

  • Figure 1: FedLoRA with heterogeneous ranks. The global update is aggregated and allocated across clients with different ranks.
  • Figure 2: Energy breakdown of the global update and accuracy under various settings. In (a) and (b), the global update has an algebraic rank of 64. Client ranks are selected from $\{8,16,32,48,64\}$. In (c), $\alpha$ controls the degree of data heterogeneity across clients. In (d), only the minimal rank $r_1$ varies across different ranks.
  • Figure 3: Illustration of rank collapse. The shared-rank directions are fully aggregated, while the higher-rank directions are diluted.
  • Figure 4: Illustration of the rank-partitioned aggregation. Each partition is weighted by its effective participating clients.
  • Figure 5: Global evaluation accuracy and global training loss over communication rounds. Results on CIFAR100 with ViT-base are shown in (a) and (b), while results on GSM8K with LLaMA-3.2-3B are shown in (c) and (d).
  • ...and 4 more figures

Theorems & Definitions (3)

  • Definition 4.1: Rank Collapse
  • Theorem 4.4: Rank Collapse in vanilla FedLoRA
  • proof