FL-TAC: Enhanced Fine-Tuning in Federated Learning via Low-Rank, Task-Specific Adapter Clustering

Siqi Ping; Yuzhu Mao; Yang Liu; Xiao-Ping Zhang; Wenbo Ding

TL;DR

The paper tackles the high communication burden of fine-tuning large pre-trained models in Federated Learning by introducing FL-TAC, a framework that trains extremely low-rank, task-specific adapters on each client and uses server-side clustering to aggregate adapters by downstream task. By assigning a distinct adapter per local task and clustering adapters at the server with $K$-means into $N$ task clusters, FL-TAC enables efficient, task-aware aggregation while reducing transmitted parameters via LoRA-style reparameterization. Empirical results across text generation (Databricks-Dolly-15k), text classification (GLUE), and image classification (CIFAR-10/100) show FL-TAC often outperforms FedIT and reduces communication overhead, with clustering visualizations illustrating evolving, well-separated task clusters during training. The work demonstrates a scalable, multi-task Federated Learning approach that preserves adaptation performance while achieving significant communication savings, and it points to future exploration of the interplay between LoRA rank and base-model capabilities.
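
To make the communication saving from LoRA-style reparameterization concrete, here is a minimal sketch that compares the number of values sent for one full weight matrix with the number sent for its low-rank factors. The function name `lora_param_counts`, the 768 x 768 layer size, and the rank of 1 are illustrative assumptions, not figures taken from the paper.

```python
from typing import Tuple


def lora_param_counts(d_out: int, d_in: int, rank: int) -> Tuple[int, int]:
    """Return (full, low_rank) parameter counts for one weight matrix.

    A full update sends d_out * d_in values. A LoRA-style update sends two
    factors, B (d_out x rank) and A (rank x d_in), i.e. rank * (d_out + d_in).
    """
    full = d_out * d_in
    low_rank = rank * (d_out + d_in)
    return full, low_rank


# Hypothetical layer size and rank, chosen only for illustration.
full, low = lora_param_counts(d_out=768, d_in=768, rank=1)
print(f"full: {full}, low-rank: {low}, reduction: {full / low:.0f}x")
```

The saving scales with the ratio of the layer's dimensions to the rank, which is why very low ranks matter when each client uploads several adapters.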

Abstract

Although large-scale pre-trained models hold great potential for adapting to downstream tasks through fine-tuning, the performance of such fine-tuned models is often limited by the difficulty of collecting sufficient high-quality, task-specific data. Federated Learning (FL) offers a promising solution by enabling fine-tuning across a large number of clients holding a variety of task data, but it is bottlenecked by significant communication overhead due to the size of the pre-trained models. This paper addresses the high communication cost of fine-tuning large pre-trained models within FL frameworks through low-rank fine-tuning. Specifically, we train a low-rank adapter for each individual task on the client side, followed by server-side clustering of similar adapters to achieve task-specific aggregation. Extensive experiments on various language and vision tasks, such as GLUE and CIFAR-10/100, reveal the evolution of task-specific adapters throughout the FL training process and verify the effectiveness of the proposed low-rank task-specific adapter clustering (TAC) method.
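
The server-side step described above can be sketched as follows: adapters uploaded by clients are flattened into vectors, grouped with K-means into one cluster per task, and averaged within each cluster. This is a minimal sketch under stated assumptions, not the paper's implementation. The function names, the toy two-dimensional adapters, and the deterministic farthest-point initialization are all hypothetical choices made so the example is self-contained and reproducible.

```python
from typing import List, Tuple

Vector = List[float]


def sq_dist(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two flattened adapters."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean(vectors: List[Vector]) -> Vector:
    """Element-wise average of a non-empty list of vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def init_centers(points: List[Vector], k: int) -> List[Vector]:
    """Farthest-point initialization (an assumption made for determinism)."""
    centers = [points[0]]
    while len(centers) < k:
        farthest = max(points, key=lambda p: min(sq_dist(p, c) for c in centers))
        centers.append(farthest)
    return centers


def cluster_and_aggregate(
    adapters: List[Vector], n_tasks: int, iters: int = 20
) -> Tuple[List[int], List[Vector]]:
    """Run K-means over adapters; return cluster labels and cluster averages.

    Each returned center is the average of the adapters assigned to that
    cluster, i.e. the aggregated adapter for one (inferred) task.
    """
    centers = init_centers(adapters, n_tasks)
    labels = [0] * len(adapters)
    for _ in range(iters):
        # Assignment step: each adapter joins its nearest center.
        labels = [
            min(range(n_tasks), key=lambda c: sq_dist(p, centers[c]))
            for p in adapters
        ]
        # Update step: each center becomes the mean of its members.
        for c in range(n_tasks):
            members = [p for p, lab in zip(adapters, labels) if lab == c]
            if members:
                centers[c] = mean(members)
    return labels, centers


# Toy upload: four flattened adapters from two clearly different tasks.
uploaded = [[0.0, 0.1], [0.1, 0.0], [5.0, 5.1], [5.1, 5.0]]
labels, aggregated = cluster_and_aggregate(uploaded, n_tasks=2)
print(labels)
print(aggregated)
```

In this sketch the server never needs task labels from the clients; the grouping is inferred from the adapter weights alone, and each cluster average would be sent back as the shared adapter for that task.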

Paper Structure (16 sections, 2 equations, 4 figures, 2 tables, 3 algorithms)

Introduction
Related Work
FL for Fine-tuning Large-scale Pre-trained Models
FL for Multi-Task Learning
Method
Overview of Fine-tuning through FL
Core Concepts and Approach
The FL-TAC Algorithm
Experiment
Experiment Setup
Experiment Results
Performance Analysis for Downstream Tasks
Performance of Clustering at the Central Server
Analysis of Communication Cost and Local Training Cost
Conclusion
...and 1 more sections

Figures (4)

Figure 1: FL-TAC framework.
Figure 2: Visualization of data distribution across clients and performance evaluation on Databricks-Dolly-15k multitask dataset.
Figure 3: Approximation error (measured by MSE) versus LoRA-rank.
Figure 4: Visualization of clustering results from epoch 1 to epoch 9.
