FedTLU: Federated Learning with Targeted Layer Updates

Jong-Ik Park, Carlee Joe-Wong

TL;DR

This paper proposes a targeted layer update strategy for fine-tuning in federated learning (FL). A scoring mechanism identifies and updates the most critical layers, and the parameters in all other layers are frozen to avoid excessively noisy or even poisoned updates.

Abstract

Federated learning (FL) addresses privacy concerns in training language models by enabling multiple clients to contribute to training without sending their data to others. However, non-IID data across clients (data that are not independent and identically distributed) often limits FL's performance. This issue is especially challenging during model fine-tuning, as noise due to variations in clients' data distributions can harm model convergence near stationary points. This paper proposes a targeted layer update strategy for fine-tuning in FL. Instead of randomly updating layers of the language model, as is often done in practice, we use a scoring mechanism to identify and update the most critical layers, avoiding excessively noisy or even poisoned updates by freezing the parameters in other layers. We show in extensive experiments that our method improves convergence and performance in non-IID settings, offering a more efficient approach to fine-tuning federated language models.
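To make the idea of targeted layer updates concrete, the following is a minimal sketch of one aggregation round in which only the top-scoring layers are updated and the rest stay frozen. The abstract does not specify the scoring mechanism or where it runs, so everything here is an assumption for illustration: the score (norm of the averaged update divided by one plus the clients' mean squared deviation from it), the server-side placement, and the names `aggregate`, `layer_score`, and `targeted_update` are all hypothetical and should not be read as the paper's method.

```python
import math
from typing import Dict, List, Set, Tuple

# Layer name -> flattened list of parameter values (or parameter deltas).
Layer = Dict[str, List[float]]


def aggregate(client_updates: List[Layer]) -> Layer:
    """Average the clients' per-layer deltas with equal weights (FedAvg-style)."""
    n = len(client_updates)
    first = client_updates[0]
    return {
        name: [sum(u[name][i] for u in client_updates) / n
               for i in range(len(first[name]))]
        for name in first
    }


def layer_score(name: str, mean_update: Layer, client_updates: List[Layer]) -> float:
    """Hypothetical score (an assumption, not the paper's formula):
    large when clients agree on a sizeable update, small when they disagree."""
    mean = mean_update[name]
    signal = math.sqrt(sum(x * x for x in mean))
    noise = sum(
        sum((u[name][i] - mean[i]) ** 2 for i in range(len(mean)))
        for u in client_updates
    ) / len(client_updates)
    return signal / (1.0 + noise)


def targeted_update(
    global_params: Layer, client_updates: List[Layer], k: int
) -> Tuple[Layer, Set[str]]:
    """Apply the averaged delta to the k highest-scoring layers; freeze the rest."""
    mean_update = aggregate(client_updates)
    scores = {name: layer_score(name, mean_update, client_updates)
              for name in mean_update}
    selected = set(sorted(scores, key=scores.get, reverse=True)[:k])
    new_params = {
        name: ([p + d for p, d in zip(params, mean_update[name])]
               if name in selected else list(params))  # frozen layers are copied unchanged
        for name, params in global_params.items()
    }
    return new_params, selected


if __name__ == "__main__":
    # Toy example: clients agree on "embed", conflict on "block", barely move "head".
    global_params = {"embed": [0.0, 0.0], "block": [0.0, 0.0], "head": [0.0, 0.0]}
    clients = [
        {"embed": [1.0, 1.0], "block": [1.0, -1.0], "head": [0.1, 0.1]},
        {"embed": [1.0, 1.0], "block": [-1.0, 1.0], "head": [0.1, 0.1]},
    ]
    params, chosen = targeted_update(global_params, clients, k=1)
    print(chosen, params)
```

In this toy round the two clients send opposite updates for `block`, so its averaged delta cancels and its score is zero, while `embed` receives a consistent update and is the only layer selected. A real fine-tuning setup would operate on model tensors rather than Python lists, and the choice of score and of `k` would follow the paper's algorithm rather than this placeholder.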

Paper Structure

This paper contains 18 sections, 11 equations, 2 figures, 3 tables, 1 algorithm.

Figures (2)

  • Figure 1: Perplexity curves for fine-tuning with Full, FedTLU, Random, and Last-layer updates on Transformer (Penn TB) and GPT-2 (UDPOS) models.
  • Figure 2: Perplexity curves for fine-tuning with Full, FedTLU, and Random updates in noisy or malicious client scenarios.