Optimizing Multilingual LLMs via Federated Learning: A Study of Client Language Composition

Aleix Sant, Jordi Luque, Carlos Escolano

Abstract

Federated Learning (FL) of Large Language Models (LLMs) in multilingual environments presents significant challenges stemming from heterogeneous language distributions across clients and disparities in language resource availability. To address these challenges, we extended the FederatedScope-LLM framework to support multilingual instruction-tuning experiments with LLMs. We also introduced a novel client-specific early stopping mechanism, Local Dynamic Early Stopping (LDES-FL), which allows clients to pause and resume local training based on client-side validation performance, enhancing training efficiency and sustainability. Through a series of experiments, we studied how client language composition, ranging from fully monolingual to increasingly multilingual clients, affects multilingual quality, fairness and training cost. Monolingual local fine-tuning remains the most effective for single-language specialization, whereas federated training is better suited to learning a single balanced multilingual model. In FL, increasing within-client multilinguality leads to stronger and fairer global models, narrows the gap to centralized multilingual fine-tuning, and yields the largest gains for lower-resource languages, albeit at the cost of more optimization steps. Overall, our results identify client language composition as a key design variable in multilingual FL, shaping performance, fairness and efficiency.
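
As a rough illustration of the pause-and-resume idea behind LDES-FL, the following Python sketch tracks one client's validation loss across federated rounds. The abstract does not state the stopping or resumption criteria, so the patience counter, the `resume_tolerance` threshold, and every name below are assumptions made for illustration; this is not the authors' algorithm.

```python
from dataclasses import dataclass


@dataclass
class ClientStopState:
    """Per-client pause/resume bookkeeping (hypothetical rule, see lead-in)."""

    patience: int = 2               # assumed: rounds without improvement before pausing
    resume_tolerance: float = 0.05  # assumed: loss increase over best that triggers resuming
    best_loss: float = float("inf")
    bad_rounds: int = 0
    paused: bool = False

    def update(self, val_loss: float) -> bool:
        """Record this round's client-side validation loss.

        Returns True if the client should run local training in the next round.
        """
        if self.paused:
            # A paused client keeps evaluating the received global model and
            # resumes once the model has drifted away from its best loss.
            if val_loss > self.best_loss + self.resume_tolerance:
                self.paused = False
                self.bad_rounds = 0
            return not self.paused

        if val_loss < self.best_loss:
            self.best_loss = val_loss
            self.bad_rounds = 0
        else:
            self.bad_rounds += 1
            if self.bad_rounds >= self.patience:
                self.paused = True
        return not self.paused


def training_schedule(losses, patience=2, resume_tolerance=0.05):
    """Return, for each round's validation loss, whether the client trains next."""
    state = ClientStopState(patience=patience, resume_tolerance=resume_tolerance)
    return [state.update(loss) for loss in losses]
```

Calling `training_schedule` on a sequence of per-round validation losses shows when a client would sit out: it pauses after `patience` rounds without improvement and rejoins only when its validation loss rises noticeably above its best value. Rounds in which a client is paused cost it no local optimization steps, which is the efficiency argument made in the abstract.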

Paper Structure

This paper contains 15 sections, 3 equations, 4 figures, 5 tables, 1 algorithm.

Figures (4)

  • Figure 1: Illustration of a multilingual FL setup. Each client primarily contains data from a single dominant language (represented by the tallest bar) along with smaller, equal portions of data from the remaining languages. Each color corresponds to a different language. As in our experiments, there are eight clients in total.
  • Figure 2: Training evolution of clients using LDES-FL with FedAvg, where each client holds data in a different language (100% mono).
  • Figure 3: Validation loss across clients under standard FedAvg (100% mono). Low-resource languages show higher validation loss, whereas high-resource languages achieve lower loss. All clients improve at a similar rate during federated training. Shaded regions mark rounds where a client is stopped.
  • Figure 4: Training evolution of clients using LDES-FL with FedAvg in the 50% mono setting. Unlike in Figure 2, clients are not labeled by language, since each client here contains a mix of languages.
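
The client language composition illustrated in Figure 1 can be sketched in a few lines of Python. The sketch assumes that an "X% mono" setting means the dominant language holds a share X of a client's data and that the remainder is split equally among the other languages, as the Figure 1 caption describes; the function names and the one-dominant-language-per-client assignment are hypothetical.

```python
def client_language_mix(num_languages: int, dominant: int, mono_fraction: float) -> list[float]:
    """Return the share of each language in one client's data.

    `mono_fraction` is the share of the dominant language; the rest is
    divided equally among the remaining languages (assumed from Figure 1).
    """
    if num_languages < 2:
        raise ValueError("need at least two languages")
    if not 0.0 <= mono_fraction <= 1.0:
        raise ValueError("mono_fraction must lie in [0, 1]")
    if not 0 <= dominant < num_languages:
        raise ValueError("dominant must index a language")

    rest = (1.0 - mono_fraction) / (num_languages - 1)
    mix = [rest] * num_languages
    mix[dominant] = mono_fraction
    return mix


def federation_mix(num_clients: int = 8, mono_fraction: float = 0.5) -> list[list[float]]:
    """Build one mix per client, with client i dominated by language i.

    Eight clients matches the Figure 1 caption; pairing each client with
    its own dominant language is an assumption of this sketch.
    """
    return [client_language_mix(num_clients, i, mono_fraction) for i in range(num_clients)]
```

With `mono_fraction=1.0` every client is fully monolingual (the "100% mono" setting), while `mono_fraction=0.5` gives each client half its data in the dominant language and one fourteenth in each of the other seven.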