ILoRA: Federated Learning with Low-Rank Adaptation for Heterogeneous Client Aggregation

Junchao Zhou, Junkang Liu, Fanhua Shang

TL;DR

ILoRA is a unified framework for federated LoRA that integrates three core innovations: a QR-based orthonormal initialization that ensures all clients start in a coherent subspace; a Concatenated QR Aggregation mechanism that fuses heterogeneous-rank updates via concatenation and decomposition, preserving information while maintaining dimension alignment; and an AdamW optimizer with rank-aware control variates that corrects local updates and mitigates client drift.

Abstract

Federated Learning with Low-Rank Adaptation (LoRA) faces three critical challenges under client heterogeneity: (1) Initialization-Induced Instability due to random initialization misaligning client subspaces; (2) Rank Incompatibility and Aggregation Error when averaging LoRA parameters of different ranks, which biases the global model; and (3) exacerbated Client Drift under Non-IID Data, impairing generalization. To address these challenges, we propose ILoRA, a unified framework that integrates three core innovations: a QR-based orthonormal initialization to ensure all clients start in a coherent subspace; a Concatenated QR Aggregation mechanism that fuses heterogeneous-rank updates via concatenation and decomposition, preserving information while maintaining dimension alignment; and an AdamW optimizer with rank-aware control variates to correct local updates and mitigate client drift. Supported by theoretical convergence guarantees, extensive experiments on vision and NLP benchmarks demonstrate that ILoRA consistently achieves superior accuracy and convergence stability compared to existing federated LoRA methods.
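The initialization and aggregation ideas described in the abstract can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`shared_orthonormal_basis`, `client_init`, `concat_qr_aggregate`), the choice to give each client the leading columns of one shared basis, and the weighting scheme are all hypothetical. The sketch only shows why concatenating heterogeneous-rank factors and then applying a QR decomposition reproduces the weighted sum of the client updates $B_k A_k$ exactly, whereas averaging the factors separately would not.

```python
import numpy as np


def shared_orthonormal_basis(dim, r_max, seed=0):
    """Draw one random matrix and orthonormalize it with QR.

    Assumption: every client derives its initial factor from this same basis,
    so all clients start in a common subspace.
    """
    rng = np.random.default_rng(seed)
    q, _ = np.linalg.qr(rng.standard_normal((dim, r_max)))
    return q  # shape (dim, r_max), orthonormal columns


def client_init(basis, rank):
    """Hypothetical rule: a rank-r client takes the first r shared columns."""
    return basis[:, :rank].copy()


def concat_qr_aggregate(updates, weights):
    """Fuse heterogeneous-rank LoRA updates by concatenation followed by QR.

    updates: list of (B_k, A_k) with B_k of shape (d_out, r_k)
             and A_k of shape (r_k, d_in); the ranks r_k may differ.
    weights: one aggregation weight per client.
    Returns (Q, M) such that Q @ M equals sum_k w_k * B_k @ A_k.
    """
    # Column-wise concatenation of the weighted B factors: (d_out, sum r_k).
    b_cat = np.concatenate([w * b for (b, _), w in zip(updates, weights)], axis=1)
    # Row-wise concatenation of the A factors: (sum r_k, d_in).
    a_cat = np.concatenate([a for _, a in updates], axis=0)
    # b_cat @ a_cat is exactly the weighted sum of the client products.
    q, r = np.linalg.qr(b_cat)  # reduced QR: q has orthonormal columns
    return q, r @ a_cat


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    ranks = [2, 4, 8]  # hypothetical client ranks
    updates = [
        (rng.standard_normal((16, r)), rng.standard_normal((r, 12))) for r in ranks
    ]
    weights = [0.2, 0.3, 0.5]
    q, m = concat_qr_aggregate(updates, weights)
    exact = sum(w * b @ a for (b, a), w in zip(updates, weights))
    print(np.allclose(q @ m, exact))
```

The sketch stops at the exact factorization. How the aggregated factors are reduced back to each client's rank is not specified in this summary, so that step is left out.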

Paper Structure

This paper contains 46 sections, 42 theorems, 71 equations, 15 figures, 18 tables, 4 algorithms.

Key Result

Theorem 1

Under the assumptions labeled ass:smoothness through ass:subspace, with $\eta_l \leq 1/L$ and $\eta_g \eta_l = \Theta(1/\sqrt{SKT})$, the convergence bound is stated in terms of $\epsilon_r = (r_{\max} - r_s)^2$, where $S$ is the number of clients per round, $K$ is the number of local steps, and $T$ is the total number of rounds.
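To give a feel for the quantities in the theorem statement, here is a short worked example with purely hypothetical values ($r_{\max} = 16$, $r_s = 4$, $S = 10$, $K = 5$, $T = 200$). It assumes that $r_s$ denotes the rank used by client $s$, which this summary does not define, and it ignores the constant hidden in $\Theta(\cdot)$.

```latex
\begin{align*}
\epsilon_r &= (r_{\max} - r_s)^2 = (16 - 4)^2 = 144, \\
\eta_g \eta_l &\propto \frac{1}{\sqrt{SKT}}
  = \frac{1}{\sqrt{10 \cdot 5 \cdot 200}}
  = \frac{1}{\sqrt{10000}}
  = \frac{1}{100}.
\end{align*}
```

Under these values the rank-mismatch term grows quadratically with the gap between a client's rank and the maximum rank, while the combined step size shrinks as more clients, local steps, or rounds are used.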

Figures (15)

  • Figure 1: Comparison of federated LoRA methods: FedIT, which suffers from aggregation error and information loss, versus ILoRA (ours), which achieves both correct aggregation and aligned initialization.
  • Figure 2: Overview of ILoRA: Clients fine-tune LoRA modules locally; the server aggregates updates via concatenated QR decomposition into a global orthogonal basis $(Q, R)$, enabling efficient communication, subspace alignment, and drift mitigation under Non-IID data.
  • Figure 3: Federated learning model updates with alignment correction. Local models (Client 1 and 2) are guided by corrections, leading the global model to converge near the true optima.
  • Figure 4: Performance comparison across settings. (a-d) CV tasks with ViT-Base/Swin-Base; (e-f) NLP tasks with RoBERTa.
  • Figure 5: Performance comparison under Non-IID settings: (a) Centralized vs. federated learning on CIFAR-10 (C10), CIFAR-100 (C100), and Tiny-ImageNet (Tiny) with $\alpha=0.3$; (b) AGNews dataset across different heterogeneity levels ($\alpha = 0.5, 0.6, 0.7$).
  • ...and 10 more figures

Theorems & Definitions (86)

  • Theorem 1: Convergence of ILoRA
  • proof : Proof Sketch
  • Theorem 2: Subspace Preservation under QR Compression
  • Theorem 3: Consistent Subspace Initialization
  • Theorem 3: Consistent Subspace Initialization
  • Theorem 4: Convergence with Control Variates and AdamW
  • proof : Proof Sketch
  • Theorem 5: Convergence of ILoRA
  • proof
  • Remark 1
  • ...and 76 more