FedALT: Federated Fine-Tuning through Adaptive Local Training with Rest-of-World LoRA
Jieming Bian, Lei Wang, Letian Zhang, Jie Xu
TL;DR
FedALT addresses cross-client interference in federated fine-tuning of large language models by decoupling local personalization from global knowledge. Each client holds two LoRA components: a trainable Individual LoRA and a frozen Rest-of-World (RoW) LoRA that aggregates knowledge from the other clients. An input-specific adaptive Mixture-of-Experts mixer dynamically weights the two components, so each client adapts to its own data while drawing on global information in a controlled manner. Empirical results on Bloom-560M and Llama 2-7B across diverse NLP tasks show that FedALT outperforms FedAvg-based and other personalized federated LoRA methods and remains robust to varying numbers of clients, LoRA ranks, and local epochs. The approach reduces harmful interference, maintains computational efficiency, and offers a practical path to privacy-preserving fine-tuning on heterogeneous client data in real-world NLP applications.
Abstract
Fine-tuning large language models (LLMs) in federated settings enables privacy-preserving adaptation but suffers from cross-client interference due to model aggregation. Existing federated LoRA fine-tuning methods, primarily based on FedAvg, struggle with data heterogeneity, leading to harmful cross-client interference and suboptimal personalization. In this work, we propose FedALT, a novel personalized federated LoRA fine-tuning algorithm that fundamentally departs from FedAvg. Instead of using an aggregated model to initialize local training, each client continues training its individual LoRA while incorporating shared knowledge through a separate Rest-of-World (RoW) LoRA component. To effectively balance local adaptation and global information, FedALT introduces an adaptive mixer that dynamically learns input-specific weightings between the individual and RoW LoRA components, drawing conceptual foundations from the Mixture-of-Experts (MoE) paradigm. Through extensive experiments on NLP benchmarks, we demonstrate that FedALT significantly outperforms state-of-the-art personalized federated LoRA fine-tuning methods, achieving superior local adaptation without sacrificing computational efficiency.
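To make the two-component design concrete, the following is a minimal NumPy sketch of a single adapted linear layer, not the authors' implementation. It assumes that the RoW LoRA is formed by averaging the LoRA factors of all other clients, and that the mixer is a softmax gate over a linear projection of the input; both choices, along with every name used here (`rest_of_world`, `mixed_forward`, `W_gate`), are illustrative assumptions rather than details stated in the abstract.

```python
import numpy as np

def softmax(z, axis=-1):
    # Numerically stable softmax.
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def rest_of_world(factors, k):
    # Assumed aggregation rule: average one LoRA factor over every
    # client except client k. The result is kept frozen on client k.
    others = [f for j, f in enumerate(factors) if j != k]
    return np.mean(others, axis=0)

def mixed_forward(x, W0, A_ind, B_ind, A_row, B_row, W_gate):
    # x: (batch, d_in); W0: (d_in, d_out) frozen pretrained weight.
    # A_*: (d_in, r) and B_*: (r, d_out) are low-rank LoRA factors.
    # W_gate: (d_in, 2) hypothetical mixer weights, trained locally.
    gate = softmax(x @ W_gate)        # (batch, 2): one weight pair per input
    delta_ind = x @ A_ind @ B_ind     # trainable Individual LoRA path
    delta_row = x @ A_row @ B_row     # frozen RoW LoRA path
    out = x @ W0 + gate[:, :1] * delta_ind + gate[:, 1:] * delta_row
    return out, gate

# Toy setup: 3 clients, rank-2 adapters on an 8 -> 6 linear layer.
rng = np.random.default_rng(0)
d_in, d_out, r, num_clients = 8, 6, 2, 3
W0 = rng.normal(size=(d_in, d_out))
A_all = [rng.normal(size=(d_in, r)) for _ in range(num_clients)]
B_all = [rng.normal(size=(r, d_out)) for _ in range(num_clients)]

k = 0  # view from client 0
A_row, B_row = rest_of_world(A_all, k), rest_of_world(B_all, k)
W_gate = rng.normal(size=(d_in, 2))
x = rng.normal(size=(4, d_in))
out, gate = mixed_forward(x, W0, A_all[k], B_all[k], A_row, B_row, W_gate)
```

In this sketch only `A_all[k]`, `B_all[k]`, and `W_gate` would receive gradients during local training; `W0`, `A_row`, and `B_row` stay fixed, which is how the frozen RoW component avoids overwriting the client's own adapter.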
