
Task-Centric Personalized Federated Fine-Tuning of Language Models

Gabriel U. Talasso, Meghdad Kurmanji, Allan M. de Souza, Nicholas D. Lane, Leandro A. Villas

Abstract

Federated Learning (FL) has emerged as a promising technique for training language models on distributed and private datasets of diverse tasks. However, aggregating models trained on heterogeneous tasks often degrades the overall performance of individual clients. To address this issue, Personalized FL (pFL) aims to create models tailored to each client's data distribution. Although these approaches improve local performance, they usually lack robustness in two aspects: (i) generalization, when clients must make predictions on unseen tasks or face changes in their data distributions, and (ii) intra-client task interference, when a single client's data contains multiple distributions that may interfere with each other during local training. To tackle these two challenges, we propose FedRouter, a clustering-based pFL method that builds specialized models for each task rather than for each client. FedRouter personalizes models with adapters, employing two clustering mechanisms to associate adapters with specific tasks: a local clustering that associates adapters with task data samples, and a global one that associates similar adapters from different clients to construct task-centric personalized models. Additionally, we propose an evaluation router mechanism that routes test samples to the best adapter based on the created clusters. In experiments comparing our method with existing approaches on a multitask dataset, FedRouter demonstrates strong resilience in these challenging scenarios, performing up to 6.1% relatively better under task interference and achieving up to 136% relative improvement under generalization evaluation.

Paper Structure

This paper contains 16 sections, 8 figures, and 1 table.

Figures (8)

  • Figure 1: FedRouter Workflow Overview. Each client first computes embeddings from its local data and applies clustering to partition the dataset into task-specific subsets. The client then sends the resulting centroids and adapters to the server, which performs global clustering to associate similar tasks across clients and aggregate their corresponding adapters collaboratively. Finally, the server sends the updated adapters back to the clients, which then associate each received model with the appropriate local task-specific dataset for the next round of training.
  • Figure 2: FedRouter Evaluation Modes. During inference, each client computes the embedding of a new data sample and associates it with the nearest centroid based on the minimum Euclidean distance. The association can be performed using either the local centroids, to obtain a personalized evaluation, or the global centroids, to enable a generalized evaluation across the federation.
  • Figure 3: Performance comparison (mean $\pm$ std) in the single-task training scenario, evaluated on all tasks at test time to assess generalization capability and robustness under test-time distribution shift.
  • Figure 4: t-SNE visualization of client test data embeddings in the single-task scenario.
  • Figure 5: Scaling model size of Llama models using FedRouter in the single-task scenario.
  • ...and 3 more figures
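As described in the abstract and the Figure 2 caption, FedRouter's evaluation router assigns each test sample to the adapter whose cluster centroid is nearest to the sample's embedding under Euclidean distance. A minimal sketch of that nearest-centroid assignment follows; the function and variable names are illustrative, not taken from the paper's implementation:

```python
import math

def route_to_adapter(sample_embedding, centroids):
    """Return the index of the nearest centroid by Euclidean distance.

    In FedRouter's routing step, this index would select which
    task-specific adapter handles the sample. Passing local centroids
    gives personalized evaluation; global centroids give generalized
    evaluation across the federation.
    """
    distances = [math.dist(sample_embedding, c) for c in centroids]
    return distances.index(min(distances))

# Toy example: two task centroids in a 3-dimensional embedding space.
centroids = [[0.0, 0.0, 0.0], [1.0, 1.0, 1.0]]
sample = [0.9, 1.1, 1.0]
print(route_to_adapter(sample, centroids))  # → 1 (closest to the second centroid)
```

In practice the embeddings would come from the language model itself and the centroids from the local or global clustering steps; the routing logic, however, reduces to this argmin over distances.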