Heterogeneous Federated Fine-Tuning with Parallel One-Rank Adaptation

Zikai Zhang; Rui Hu; Jiahao Xu

TL;DR

This work proposes Fed-PLoRA, a novel lightweight heterogeneous federated fine-tuning (FFT) framework, and provides a unified analysis of initialization and aggregation noise of Fed-PLoRA to demonstrate how it addresses the limitations of state-of-the-art methods.

Abstract

Large Language Models (LLMs) have demonstrated remarkable effectiveness in adapting to downstream tasks through fine-tuning. Federated Learning (FL) extends this capability by enabling collaborative fine-tuning across distributed clients using Low-Rank Adaptation (LoRA), while preserving data privacy by avoiding raw data sharing. However, practical deployments face challenges when clients have heterogeneous resources and thus adopt different LoRA ranks, leading to substantial initialization and aggregation noise that undermines performance. To address these challenges, we propose Fed-PLoRA, a novel lightweight heterogeneous federated fine-tuning (FFT) framework. Fed-PLoRA introduces Parallel One-Rank Adaptation (PLoRA), a new LoRA variant that replaces the classic multi-rank LoRA module with multiple parallel one-rank modules, and a novel Select-N-Fold strategy that folds untrained PLoRA modules into the pre-trained weights before local training, thereby accommodating heterogeneous client resources. We provide a unified analysis of initialization and aggregation noise of Fed-PLoRA and demonstrate how it addresses the limitations of state-of-the-art methods. Extensive experiments on diverse LLM fine-tuning tasks demonstrate that Fed-PLoRA consistently outperforms existing methods in both accuracy and efficiency. The code is available at https://github.com/TNI-playground/Fed-PLoRA.
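
The decomposition behind PLoRA can be checked numerically. The following is a minimal sketch, not the authors' implementation. It assumes the standard LoRA update W + BA, with B of shape (d_out, r) and A of shape (r, d_in), and shows that this update equals the sum of r one-rank products. It also illustrates one plausible reading of "folding": a client that trains only some modules merges the remaining ones into its frozen weight, so the effective weight is unchanged before local training. The function names `plora_delta` and `fold_untrained`, the shapes, and the choice of trained modules are illustrative assumptions; the linked repository is the reference for the actual method.

```python
import numpy as np

def plora_delta(B, A, idx=None):
    """Sum of one-rank updates b_i a_i^T over the modules in idx (all modules if None)."""
    r = B.shape[1]
    idx = range(r) if idx is None else idx
    delta = np.zeros((B.shape[0], A.shape[1]))
    for i in idx:
        # Column i of B and row i of A form one parallel one-rank module.
        delta += np.outer(B[:, i], A[i, :])
    return delta

def fold_untrained(W, B, A, trained_idx):
    """Merge modules NOT in trained_idx into the frozen weight.

    Assumption: this is our reading of "folding" from the abstract; how the
    trained modules are selected is not modelled here.
    """
    trained = set(trained_idx)
    untrained = [i for i in range(B.shape[1]) if i not in trained]
    return W + plora_delta(B, A, untrained)

rng = np.random.default_rng(0)
d_out, d_in, r = 6, 5, 4          # hypothetical layer size and global rank
W = rng.normal(size=(d_out, d_in))
B = rng.normal(size=(d_out, r))
A = rng.normal(size=(r, d_in))

# A rank-r LoRA update equals the sum of r parallel one-rank modules.
assert np.allclose(B @ A, plora_delta(B, A))

# A resource-limited client trains only modules 0 and 2 (arbitrary choice).
trained = [0, 2]
W_folded = fold_untrained(W, B, A, trained)

# Folding leaves the effective weight unchanged before local training.
assert np.allclose(W_folded + plora_delta(B, A, trained), W + B @ A)
```

In this sketch the client would update only the trained modules' columns of B and rows of A, while `W_folded` stays frozen.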

Paper Structure (39 sections, 25 equations, 9 figures, 16 tables, 1 algorithm)

Introduction
Federated Fine-Tuning System
Parallel One-Rank Adaptation for Heterogeneous FFT
Motivation
Parallel One-Rank Adaptation
Fed-PLoRA: Heterogeneous FFT with PLoRA
Analysis of Initialization and Aggregation Noise
Communication, computation, and memory overhead
Evaluation
Main Experimental Results
Additional Discussions
Related Work
Conclusion
Code snippet and pseudo-code
Comprehensive Literature Reviews
...and 24 more sections

Figures (9)

Figure 1: Framework of FLoRA, FlexLoRA, HETLoRA, and our method Fed-PLoRA.
Figure 2: Cosine similarities between parallel one-rank modules of PLoRA. (a) Low, random similarities at initialization; (b) increasing within-rank similarity across clients, while cross-rank similarity remains low, indicating that modules capture distinct knowledge at different ranks but converge on similar knowledge across clients, despite differences in resource limitations and data distributions (see the appendix).
Figure 3: Performance of PLoRA and Fed-PrsLoRA in homogeneous settings.
Figure 4: Select-N-Fold vs. other selection methods.
Figure 5: Training efficiency of Fed-PLoRA.
...and 4 more figures
