Adaptive LoRA Experts Allocation and Selection for Federated Fine-Tuning
Lei Wang, Jieming Bian, Letian Zhang, Jie Xu
TL;DR
The paper tackles the problem of privately fine-tuning large language models under data heterogeneity in federated settings by introducing FedLEASE, a framework that jointly optimizes LoRA expert allocation and adaptive expert usage. It clusters clients based on representation similarity of LoRA adapters, selects the optimal number of experts with silhouette analysis, and employs an adaptive top-$M$ router that guarantees inclusion of each client’s assigned expert while flexibly recruiting others as needed. Empirical results on NLU (GLUE with RoBERTa-Large) and NLG (FLAN with LLaMA-2-7B) show consistent improvements over strong federated LoRA baselines, highlighting gains in both accuracy and generation metrics across task-heterogeneous and label-heterogeneous settings. The approach achieves these benefits with modest communication overhead, demonstrating practical applicability for privacy-preserving, domain-specific fine-tuning of LLMs.
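To make the routing idea concrete, the following is a minimal sketch of a gate that always keeps a client's assigned expert and recruits additional experts only when they look useful. It is not the authors' router: the function name `adaptive_top_m`, the `min_prob` threshold, and the rule "recruit an expert if its softmax probability is at least `min_prob`" are all assumptions made for illustration.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def adaptive_top_m(logits, assigned, max_experts, min_prob=0.2):
    """Pick between 1 and max_experts experts for one input.

    The assigned expert is always selected. Other experts are added in
    order of decreasing gate probability while they clear min_prob
    (a hypothetical recruitment rule, not taken from the paper).
    Returns {expert_index: renormalized mixing weight}.
    """
    probs = softmax(logits)
    chosen = [assigned]
    others = sorted(
        (i for i in range(len(probs)) if i != assigned),
        key=lambda i: probs[i],
        reverse=True,
    )
    for i in others:
        if len(chosen) >= max_experts or probs[i] < min_prob:
            break
        chosen.append(i)
    total = sum(probs[i] for i in chosen)
    return {i: probs[i] / total for i in chosen}


# Two other experts score highly, so they are recruited next to expert 0.
weights = adaptive_top_m([0.1, 2.0, 1.9, -1.0], assigned=0, max_experts=3)
# The assigned expert dominates, so it is used alone.
solo = adaptive_top_m([5.0, 0.0, 0.0, 0.0], assigned=0, max_experts=3)
```

Under this sketch the number of active experts varies per input between one and `max_experts`, which mirrors the "flexibly recruiting others as needed" behavior described above without claiming to reproduce the paper's exact formulation.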
Abstract
Large Language Models (LLMs) have demonstrated impressive capabilities across various tasks, but fine-tuning them for domain-specific applications often requires substantial domain-specific data that may be distributed across multiple organizations. Federated Learning (FL) offers a privacy-preserving solution, but faces challenges with computational constraints when applied to LLMs. Low-Rank Adaptation (LoRA) has emerged as a parameter-efficient fine-tuning approach, though a single LoRA module often struggles with heterogeneous data across diverse domains. This paper addresses two critical challenges in federated LoRA fine-tuning: (1) determining the optimal number and allocation of LoRA experts across heterogeneous clients, and (2) enabling clients to selectively utilize these experts based on their specific data characteristics. We propose FedLEASE (Federated adaptive LoRA Expert Allocation and SElection), a novel framework that adaptively clusters clients based on representation similarity to allocate and train domain-specific LoRA experts. It also introduces an adaptive top-$M$ Mixture-of-Experts mechanism that allows each client to select the optimal number of utilized experts. Our extensive experiments on diverse benchmark datasets demonstrate that FedLEASE significantly outperforms existing federated fine-tuning approaches in heterogeneous client settings while maintaining communication efficiency.
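The expert allocation step can be sketched as follows: cluster clients by the similarity of their LoRA updates, then keep the cluster count with the highest mean silhouette score. This is an illustrative sketch under stated assumptions, not the paper's implementation: each client is represented by a single flattened vector, similarity is cosine distance, clustering is average-linkage agglomerative, and the names `choose_num_experts` and `max_k` are hypothetical. It assumes nonzero vectors and `max_k` smaller than the number of clients.

```python
import math


def cosine_distance(u, v):
    """1 - cosine similarity between two nonzero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def agglomerative(dist, k):
    """Average-linkage agglomerative clustering down to k clusters."""
    clusters = [[i] for i in range(len(dist))]
    while len(clusters) > k:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                pairs = [dist[i][j] for i in clusters[a] for j in clusters[b]]
                d = sum(pairs) / len(pairs)
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        merged = clusters.pop(b)  # b > a, so index a stays valid
        clusters[a].extend(merged)
    labels = [0] * len(dist)
    for c, members in enumerate(clusters):
        for i in members:
            labels[i] = c
    return labels


def silhouette(dist, labels):
    """Mean silhouette score; singleton clusters contribute 0."""
    n = len(labels)
    scores = []
    for i in range(n):
        same = [dist[i][j] for j in range(n) if j != i and labels[j] == labels[i]]
        if not same:
            scores.append(0.0)
            continue
        a = sum(same) / len(same)
        b = min(
            sum(dist[i][j] for j in range(n) if labels[j] == c) / labels.count(c)
            for c in set(labels) if c != labels[i]
        )
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / n


def choose_num_experts(client_vectors, max_k):
    """Return (k, labels) maximizing the silhouette score for 2 <= k <= max_k."""
    dist = [[cosine_distance(u, v) for v in client_vectors] for u in client_vectors]
    best_k, best_labels, best_score = None, None, -2.0
    for k in range(2, max_k + 1):
        labels = agglomerative(dist, k)
        score = silhouette(dist, labels)
        if score > best_score:
            best_k, best_labels, best_score = k, labels, score
    return best_k, best_labels


# Six toy clients whose update vectors fall into three directions.
toy_clients = [
    [1.0, 0.0, 0.0], [0.9, 0.1, 0.0],
    [0.0, 1.0, 0.0], [0.1, 0.9, 0.0],
    [0.0, 0.0, 1.0], [0.0, 0.1, 0.9],
]
num_experts, assignment = choose_num_experts(toy_clients, max_k=5)
```

In this toy setup each cluster would correspond to one LoRA expert, and `assignment[i]` would be the expert assigned to client `i`. A real system would derive the client vectors from trained adapter weights and would likely use an optimized clustering library rather than this quadratic pure-Python loop.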
