FedRef: Communication-Efficient Bayesian Fine-Tuning using a Reference Model

Taehwan Yoon; Bongjun Choi; Wesley De Neve

TL;DR

This paper tackles the dual challenge in federated learning of data heterogeneity and catastrophic forgetting, proposing FedRef, a communication-efficient Bayesian fine-tuning method that uses a reference model to stabilize global updates. FedRef formulates server-side MAP optimization with a prior derived from a reference (temporal) model and likelihood contributions from participating clients, effectively incorporating past global features through a Laplace-approximated prior. By replacing client-side proximal computations with a server-side reference-based regularization, FedRef achieves higher predictive performance with fewer communication rounds across FEMNIST, CINIC-10, and FeTS2022, while keeping client computation low. The work offers a practical, scalable approach for privacy-preserving FL that maintains performance in non-IID settings and lays groundwork for adaptive reference schemes and alternative Fisher information formulations.

Abstract

Federated learning (FL) collaboratively trains artificial intelligence (AI) models while preserving user data privacy: clients share with the server only the model updates produced by local training on their own data. However, model performance may suffer due to data and system heterogeneity among clients in FL scenarios. Previous studies have proposed model optimization, fine-tuning, and personalization to achieve improved model performance. Despite these efforts, models resulting from FL scenarios often exhibit catastrophic forgetting, which increases the communication and computational costs of clients for model optimization and raises energy consumption. To address these challenges, we propose a reference model-based fine-tuning method for federated learning that overcomes catastrophic forgetting in each round. Our method is derived from Bayesian parameter-efficient transfer learning and includes a proximal term. It employs a reference model that incorporates previous model parameters and reviews previous global features in the model optimization step to mitigate catastrophic forgetting. As a result, our method achieves higher model performance and lower communication and computational costs for clients than existing methods.
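
To make the server-side idea concrete, the following is a minimal sketch of a MAP-style objective with a quadratic prior centered on a reference model. It is not the authors' implementation. The function names, the use of a plain average of past global models as the reference, the diagonal Fisher weights, and the hyperparameters `lam` and `lr` are all assumptions introduced for illustration; the paper's exact formulation may differ.

```python
# Hypothetical sketch: every name and the exact form of the objective are
# assumptions for illustration, not the FedRef reference implementation.

def reference_model(past_globals):
    """Element-wise average of past global parameter vectors (assumed reference)."""
    n = len(past_globals)
    return [sum(values) / n for values in zip(*past_globals)]


def map_objective(theta, client_losses, client_weights, theta_ref, fisher_diag, lam):
    """Weighted client loss plus a quadratic (Laplace-style) prior term.

    The prior penalizes squared distance from the reference parameters,
    scaled per parameter by an assumed diagonal Fisher estimate.
    """
    likelihood = sum(w * l for w, l in zip(client_weights, client_losses))
    prior = sum(f * (t - r) ** 2 for f, t, r in zip(fisher_diag, theta, theta_ref))
    return likelihood + 0.5 * lam * prior


def map_gradient_step(theta, loss_grad, theta_ref, fisher_diag, lam, lr):
    """One server-side gradient step on the objective above.

    loss_grad is the aggregated gradient of the client loss term; the prior
    adds lam * fisher * (theta - theta_ref) for each parameter.
    """
    return [
        t - lr * (g + lam * f * (t - r))
        for t, g, f, r in zip(theta, loss_grad, fisher_diag, theta_ref)
    ]


# Toy usage with two parameters and two past global models.
ref = reference_model([[1.0, 2.0], [3.0, 4.0]])
objective = map_objective([2.0, 4.0], [0.5, 1.5], [0.5, 0.5], ref, [1.0, 2.0], lam=0.1)
updated = map_gradient_step([2.0, 4.0], [0.0, 0.0], ref, [1.0, 2.0], lam=0.1, lr=1.0)
```

In this sketch the regularization runs entirely on the server, so clients would perform only ordinary local training. That matches the claim that the proximal computation is moved off the clients, under the assumptions stated above.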
