FedDr+: Stabilizing Dot-regression with Global Feature Distillation for Federated Learning
Seongyoon Kim, Minchan Jeong, Sungnyun Kim, Sungwoo Cho, Sumyeong Ahn, Se-Young Yun
TL;DR
Federated learning under non-IID data suffers from client drift, especially when the final classifier layer is misaligned with heterogeneous feature extractors. The paper studies the dot-regression loss ($\mathcal{L}_{DR}$) with a frozen simplex ETF classifier and finds that it yields strong local alignment but handles unseen classes poorly, which hurts the global model. To address this, FedDr+ adds a feature distillation loss ($\mathcal{L}_{FD}$) and trains with $\mathcal{L}_{Dr+} = \beta \mathcal{L}_{DR} + (1-\beta) \mathcal{L}_{FD}$, which preserves global knowledge while maintaining alignment. Experiments on CIFAR-10/100 show that FedDr+ outperforms existing frozen-classifier methods in both global and personalized FL across diverse non-IID settings. The method improves the stability and generalization of FL by preventing forgetting of unseen classes during local updates while retaining fast feature alignment.
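The weighted combination $\mathcal{L}_{Dr+} = \beta \mathcal{L}_{DR} + (1-\beta) \mathcal{L}_{FD}$ can be illustrated for a single sample with a minimal sketch. The summary above does not spell out the two component losses, so the forms used here are assumptions: $\mathcal{L}_{DR}$ is taken as half the squared gap between 1 and the cosine of the local feature with the frozen class vector, and $\mathcal{L}_{FD}$ as one minus the cosine between the local and global features. The function names are hypothetical and are not taken from the authors' code.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def dot_regression_loss(local_feature, class_vector):
    # Assumed form: 0.5 * (cos(feature, frozen class vector) - 1)^2.
    return 0.5 * (cosine(local_feature, class_vector) - 1.0) ** 2


def feature_distillation_loss(local_feature, global_feature):
    # Assumed form: 1 - cos(local feature, global-model feature).
    return 1.0 - cosine(local_feature, global_feature)


def feddr_plus_loss(local_feature, global_feature, class_vector, beta):
    """Combined loss: beta * L_DR + (1 - beta) * L_FD for one sample."""
    l_dr = dot_regression_loss(local_feature, class_vector)
    l_fd = feature_distillation_loss(local_feature, global_feature)
    return beta * l_dr + (1.0 - beta) * l_fd
```

Under these assumed forms, setting `beta = 1.0` recovers pure dot-regression training, while smaller values of `beta` pull the local features toward those of the global model.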
Abstract
Federated Learning (FL) has emerged as a pivotal framework for developing effective global models (global FL) or personalized models (personalized FL) across clients with heterogeneous, non-IID data distributions. A key challenge in FL is client drift, where data heterogeneity impedes the aggregation of scattered knowledge. Recent studies have tackled client drift by identifying significant divergence in the last classifier layer. To mitigate this divergence, strategies such as freezing the classifier weights and aligning the feature extractor accordingly have proven effective. Although local alignment between the classifier and the feature extractor has been studied as a crucial factor in FL, we observe that it may lead the model to overemphasize the classes observed within each client. Our objectives are therefore twofold: (1) enhancing local alignment while (2) preserving the representation of unseen class samples. This approach aims to integrate knowledge from individual clients effectively, thereby improving performance for both global and personalized FL. To achieve this, we introduce a novel algorithm named FedDr+, which strengthens local model alignment using the dot-regression loss. FedDr+ freezes the classifier as a simplex ETF to align the features and improves the aggregated global model by employing a feature distillation mechanism to retain information about unseen/missing classes. We provide empirical evidence that our algorithm surpasses existing methods that use a frozen classifier to boost alignment across diverse data distributions.
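To make the frozen classifier concrete, the sketch below builds a simplex ETF (equiangular tight frame): a set of $C$ unit-norm class vectors whose pairwise cosine similarity is $-1/(C-1)$. The general construction maps these vectors into a feature space of dimension $d \geq C$ through a matrix with orthonormal columns. This sketch assumes the simplest case, $d = C$ with the identity as that matrix. The function name is hypothetical and is not taken from the authors' code.

```python
import math


def simplex_etf(num_classes):
    """Return num_classes unit vectors forming a simplex ETF in R^num_classes.

    Row i is sqrt(C / (C - 1)) * (e_i - (1 / C) * ones), where e_i is the
    i-th standard basis vector. Simplifying assumption: feature dim == C.
    """
    c = num_classes
    if c < 2:
        raise ValueError("a simplex ETF needs at least two classes")
    scale = math.sqrt(c / (c - 1))
    return [
        [scale * ((1.0 if i == j else 0.0) - 1.0 / c) for j in range(c)]
        for i in range(c)
    ]
```

Because these vectors are fixed before training and never updated, every client shares the same classifier geometry, and only the feature extractor has to be aligned and aggregated.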
