Robust Federated Finetuning of LLMs via Alternating Optimization of LoRA

Shuangyi Chen; Yuanxin Guo; Yue Ju; Harik Dalal; Zhongwen Zhu; Ashish Khisti

TL;DR

RoLoRA tackles inexact model updates in federated LoRA fine-tuning by introducing alternating optimization over the LoRA down- and up-projection matrices, enabling robust, expressive adapters under communication constraints. Theoretical analysis on a linear regressor demonstrates exponential convergence to the global optimum, while non-linear experiments and non-convex convergence guarantees extend these insights to practical models. Empirical results on RoBERTa-Large and Llama-2-7B across GLUE, commonsense reasoning, and generation tasks show RoLoRA consistently outperforms FedAVG-LoRA, FFA-LoRA, and FlexLoRA, especially as the number of clients grows or finetuning budgets shrink. The method halves communication compared to full LoRA baselines and scales to large FL settings, offering a principled, scalable approach to robust federated fine-tuning of large language models.

Abstract

Parameter-Efficient Fine-Tuning (PEFT) methods like Low-Rank Adaptation (LoRA) optimize federated training by reducing computational and communication costs. We propose RoLoRA, a federated framework that uses alternating optimization to fine-tune LoRA adapters. Our approach emphasizes the importance of learning both the up- and down-projection matrices to enhance expressiveness and robustness. We use both theoretical analysis and extensive experiments to demonstrate the advantages of RoLoRA over prior approaches that either generate imperfect model updates or limit the expressiveness of the model. We provide a theoretical analysis on a linear model to highlight the importance of learning both the down-projection and up-projection matrices in LoRA. We validate these insights on a non-linear model and separately provide a convergence proof under general conditions. To bridge theory and practice, we conduct extensive experimental evaluations on language models, including RoBERTa-Large and Llama-2-7B, across diverse tasks and FL settings to demonstrate the advantages of RoLoRA over other methods.
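To make the phrase "imperfect model updates" concrete, the following is a minimal pure-Python sketch, not the authors' implementation. It assumes (the abstract does not spell this out) that the imperfection arises when a server averages the two LoRA factors separately, and that an alternating round keeps one factor shared and frozen while clients train the other. The matrix sizes, client count, and helper names are all hypothetical, and random matrices stand in for locally trained factors.

```python
import random

def matmul(X, Y):
    """Plain nested-list matrix product."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def mean(mats):
    """Element-wise average of equally shaped matrices."""
    n, rows, cols = len(mats), len(mats[0]), len(mats[0][0])
    return [[sum(m[i][j] for m in mats) / n for j in range(cols)] for i in range(rows)]

def max_abs_diff(X, Y):
    return max(abs(a - b) for rx, ry in zip(X, Y) for a, b in zip(rx, ry))

def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]

rng = random.Random(0)
d, r, k, clients = 4, 2, 3, 5  # hypothetical dimensions, LoRA rank, client count

# Case 1 (assumed baseline): each client ends the round with its own B_i and A_i.
Bs = [rand_matrix(d, r, rng) for _ in range(clients)]
As = [rand_matrix(r, k, rng) for _ in range(clients)]
ideal = mean([matmul(B, A) for B, A in zip(Bs, As)])  # average of the true updates B_i A_i
naive = matmul(mean(Bs), mean(As))                    # product of separately averaged factors
gap_both = max_abs_diff(ideal, naive)

# Case 2 (assumed alternating round): A is shared and frozen, only B_i differs.
A_shared = rand_matrix(r, k, rng)
ideal_alt = mean([matmul(B, A_shared) for B in Bs])
agg_alt = matmul(mean(Bs), A_shared)
gap_alt = max_abs_diff(ideal_alt, agg_alt)

print("gap when both factors vary:", gap_both)
print("gap when A is frozen:      ", gap_alt)
```

Under these assumptions, the second gap vanishes up to floating-point rounding because matrix multiplication is linear in B when A is fixed, whereas the first gap is generally nonzero. Swapping which factor is frozen from round to round would still let both factors be learned over time, which is the property the abstract emphasizes.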
