Communication-Efficient Federated Learning through Adaptive Weight Clustering and Server-Side Distillation

Vasileios Tsouvalas; Aaqib Saeed; Tanir Ozcelebi; Nirvana Meratnia

TL;DR

This paper addresses the high communication cost of federated learning by introducing FedCompress, a two-stage approach that combines on-device weight clustering with server-side distillation on out-of-distribution data. The method leaves standard FL aggregation intact, requiring no changes to FedAvg, while reducing communication in both directions, including downstream model updates. A representation quality score derived from unlabeled client data guides dynamic adjustment of the number of clusters per layer, which lets the compression level adapt to task complexity. Experiments across vision and audio datasets show communication cost reductions (CCR) of around $4.5\times$ and model-size reductions (MCR) of around $4.14\times$ with negligible accuracy loss, along with edge-inference speedups of up to $1.15\times$ ($1.24\times$ when quantized).
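To make the weight-clustering idea concrete, the following is a minimal sketch, not the authors' implementation. It assumes plain 1-D k-means with linearly spaced initial centroids, applied to a single randomly generated layer; the names `cluster_weights` and `compression_ratio` are hypothetical. The storage estimate is an idealized codebook-plus-indices calculation and is not the CCR or MCR metric reported in the paper.

```python
import numpy as np

def cluster_weights(weights, num_clusters, num_iters=20):
    """Map every weight to one of `num_clusters` shared centroids (1-D k-means)."""
    flat = weights.ravel()
    # Assumption: spread the initial centroids evenly over the weight range.
    centroids = np.linspace(flat.min(), flat.max(), num_clusters)
    for _ in range(num_iters):
        # Assign each weight to its nearest centroid.
        assignments = np.abs(flat[:, None] - centroids[None, :]).argmin(axis=1)
        # Move each non-empty centroid to the mean of its members.
        for k in range(num_clusters):
            members = flat[assignments == k]
            if members.size > 0:
                centroids[k] = members.mean()
    # Final assignment against the updated centroids.
    assignments = np.abs(flat[:, None] - centroids[None, :]).argmin(axis=1)
    return centroids, assignments.reshape(weights.shape)

def compression_ratio(num_weights, num_clusters, float_bits=32):
    """Idealized ratio: dense floats vs. per-weight indices plus a codebook."""
    index_bits = max(1, int(np.ceil(np.log2(num_clusters))))
    clustered_bits = num_weights * index_bits + num_clusters * float_bits
    return (num_weights * float_bits) / clustered_bits

# Hypothetical layer used only for illustration.
rng = np.random.default_rng(0)
layer = rng.normal(size=(64, 32)).astype(np.float32)
centroids, idx = cluster_weights(layer, num_clusters=16)
clustered_layer = centroids[idx]  # same shape, at most 16 distinct values
ratio = compression_ratio(layer.size, num_clusters=16)
```

Fewer clusters give a smaller payload but a coarser approximation of the weights, which is the trade-off that a per-layer, dynamically adjusted cluster count is meant to manage.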

Abstract

Federated Learning (FL) is a promising technique for the collaborative training of deep neural networks across multiple devices while preserving data privacy. Despite its potential benefits, FL is hindered by excessive communication costs due to repeated server-client communication during training. To address this challenge, model compression techniques such as sparsification and weight clustering are applied, but these often require modifying the underlying model aggregation schemes or involve cumbersome hyperparameter tuning; the latter not only adjusts the model's compression rate but also limits the model's potential for continuous improvement over growing data. In this paper, we propose FedCompress, a novel approach that combines dynamic weight clustering and server-side knowledge distillation to reduce communication costs while learning highly generalizable models. Through a comprehensive evaluation on diverse public datasets, we demonstrate the efficacy of our approach compared to baselines in terms of communication costs and inference speed.
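The abstract does not spell out the distillation objective, so the sketch below is an assumption for illustration only: a standard temperature-scaled KL-divergence loss, with the aggregated model treated as the teacher and its clustered copy as the student, both evaluated on the same unlabeled batch. The function names and the temperature value are hypothetical.

```python
import numpy as np

def softmax(logits, temperature=1.0):
    """Numerically stable softmax over the last axis."""
    scaled = logits / temperature
    scaled = scaled - scaled.max(axis=-1, keepdims=True)
    exp = np.exp(scaled)
    return exp / exp.sum(axis=-1, keepdims=True)

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """Mean KL(teacher || student) on temperature-softened outputs."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    eps = 1e-12  # guards against log(0)
    kl = np.sum(p * (np.log(p + eps) - np.log(q + eps)), axis=-1)
    # The T^2 factor is the usual rescaling in temperature-based distillation.
    return float(kl.mean() * temperature ** 2)

# Hypothetical logits for a batch of 8 unlabeled samples and 10 classes.
rng = np.random.default_rng(0)
teacher = rng.normal(size=(8, 10))
student = teacher + 0.5 * rng.normal(size=(8, 10))
loss = distillation_loss(teacher, student)
```

Because the loss depends only on model outputs, it needs no labels, which is consistent with running the server-side step on out-of-distribution data.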

Paper Structure (6 sections, 2 equations, 2 figures, 2 tables, 1 algorithm)

Methodology
Problem Formulation
Federated Model Compression
Evaluation
Conclusion
FedCompress Algorithm

Figures (2)

Figure 1: Illustration of FedCompress for communication-efficient FL. A dual-stage compression scheme is proposed: (i) weight clustering across clients during on-device training, and (ii) self-compression on server-side combining weight clustering with knowledge distillation on out-of-distribution data.
Figure 2: Relationship between mean representation quality score and mean validation accuracy across clients during FL training for FedCompress on CIFAR-10 and SpeechCommands. A strong positive correlation is observed, indicating that the representation quality score is a useful indicator of the client models' representational power.
