Relaxed Contrastive Learning for Federated Learning
Seonguk Seo, Jinkyu Kim, Geeho Kim, Bohyung Han
TL;DR
This work tackles the gradient inconsistency that client data heterogeneity causes in federated learning, linking local gradient deviations to the distribution of feature representations. It shows that supervised contrastive learning (SCL) can mitigate these deviations but induces representation collapse, which hinders transferability. To address this, the authors introduce FedRCL, a relaxed supervised contrastive loss with a divergence penalty, together with a multi-level extension that promotes diversity across intermediate representations, thereby improving transferability and convergence. Empirical results on CIFAR-10/100 and Tiny-ImageNet under diverse non-iid settings show that FedRCL outperforms strong baselines, remains robust under low participation and varying backbones, and is compatible with server-side optimization techniques. The approach offers a practical, privacy-preserving way to enhance collaborative learning in heterogeneous FL environments.
Abstract
We propose a novel contrastive learning framework to effectively address the challenges of data heterogeneity in federated learning. We first analyze the inconsistency of gradient updates across clients during local training and establish its dependence on the distribution of feature representations, which leads to the derivation of the supervised contrastive learning (SCL) objective to mitigate local deviations. In addition, we show that a naïve adoption of SCL in federated learning leads to representation collapse, resulting in slow convergence and limited performance gains. To address this issue, we introduce a relaxed contrastive learning loss that imposes a divergence penalty on excessively similar sample pairs within each class. This strategy prevents collapsed representations and enhances feature transferability, facilitating collaborative training and leading to significant performance improvements. Extensive experiments on standard benchmarks show that our framework outperforms all existing federated learning approaches by large margins.
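The idea of penalizing excessively similar same-class pairs can be illustrated with a minimal sketch. It is not the authors' exact loss: the function name `relaxed_supcon_loss`, the parameters `tau`, `threshold`, and `beta`, and the hinge-style penalty on cosine similarity are all assumptions chosen for illustration. The sketch adds, to a standard supervised contrastive term, a penalty that is active only when two same-class features are more similar than `threshold`.

```python
import math


def cosine(u, v):
    """Cosine similarity of two non-zero vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def relaxed_supcon_loss(features, labels, tau=0.5, threshold=0.7, beta=1.0):
    """Supervised contrastive loss plus a hypothetical divergence penalty.

    Assumption: the penalty is a hinge on cosine similarity,
    max(0, sim - threshold), averaged over same-class pairs.
    The formulation in the paper may differ.
    """
    n = len(features)
    sim = [[cosine(features[i], features[j]) for j in range(n)] for i in range(n)]
    total, anchors = 0.0, 0
    for i in range(n):
        positives = [j for j in range(n) if j != i and labels[j] == labels[i]]
        if not positives:
            continue  # an anchor without same-class samples contributes nothing
        # Softmax denominator over every other sample in the batch.
        denom = sum(math.exp(sim[i][k] / tau) for k in range(n) if k != i)
        # Standard SCL term: pull same-class samples together.
        scl = -sum(
            math.log(math.exp(sim[i][j] / tau) / denom) for j in positives
        ) / len(positives)
        # Relaxation: push apart same-class pairs that are already too similar.
        penalty = sum(
            max(0.0, sim[i][j] - threshold) for j in positives
        ) / len(positives)
        total += scl + beta * penalty
        anchors += 1
    return total / anchors if anchors else 0.0


# Two classes whose features have fully collapsed onto one point each.
collapsed = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
labels = [0, 0, 1, 1]
plain = relaxed_supcon_loss(collapsed, labels, beta=0.0)
relaxed = relaxed_supcon_loss(collapsed, labels, beta=1.0)
```

With `beta=0.0` the function reduces to the plain supervised contrastive term, so `relaxed - plain` isolates the penalty. For the collapsed batch above, every same-class pair has cosine similarity 1, so each anchor is penalized for the margin by which it exceeds `threshold`; pairs below the threshold are left untouched.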
