Semi-Variance Reduction for Fair Federated Learning
Saber Malekmohammadi, Yaoliang Yu
TL;DR
This work tackles fairness in Federated Learning (FL) under non-IID client data by introducing two risk-aware algorithms, VRed and SemiVRed, inspired by the Mean-Variance and Mean-Semi-Variance risk models from finance. VRed adds a variance penalty to the global objective, which increases the emphasis on higher-loss clients, while SemiVRed uses a one-sided semi-variance term to target downside risk without overly suppressing well-performing clients. The authors provide gradient-based aggregation rules, interpretive insights, and bounds linking the objectives to distributionally robust optimization. They validate the methods on vision and language benchmarks with heterogeneous data, showing that SemiVRed achieves state-of-the-art fairness and, in several settings, improved overall mean accuracy. The approach offers a principled, risk-driven mechanism for improving fairness in FL without sacrificing overall system performance, with practical implications for deploying fair and robust FL systems in heterogeneous real-world environments.
Abstract
Ensuring fairness in a Federated Learning (FL) system, i.e., satisfactory performance for all of the diverse participating clients, is an important and challenging problem. There are multiple fair FL algorithms in the literature, which have been relatively successful in providing fairness. However, these algorithms mostly emphasize the loss functions of the worst-off clients to improve their performance, which often results in the suppression of well-performing ones. As a consequence, they usually sacrifice the system's overall average performance to achieve fairness. Motivated by this and inspired by two well-known risk modeling methods in finance, Mean-Variance and Mean-Semi-Variance, we propose and study two new fair FL algorithms, Variance Reduction (VRed) and Semi-Variance Reduction (SemiVRed). VRed encourages equality between clients' loss functions by penalizing their variance. In contrast, SemiVRed penalizes the discrepancy of only the worst-off clients' loss functions from the average loss. Through extensive experiments on multiple vision and language datasets, we show that SemiVRed achieves state-of-the-art performance in scenarios with heterogeneous data distributions and improves both fairness and the system's overall average performance.
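To make the difference between the two penalties concrete, the following is a minimal sketch of how the two objectives could be evaluated from a list of per-client losses. It is not the authors' implementation: the function names, the penalty weight `beta`, the uniform averaging over clients, and the choice of a squared one-sided deviation for the semi-variance term are all assumptions made for illustration.

```python
def vred_objective(losses, beta):
    """Mean client loss plus beta times the variance of the client losses.

    Illustrative form only: `beta` and the uniform client weighting are
    assumptions, not taken from the abstract.
    """
    mean = sum(losses) / len(losses)
    variance = sum((loss - mean) ** 2 for loss in losses) / len(losses)
    return mean + beta * variance


def semivred_objective(losses, beta):
    """Mean client loss plus beta times a one-sided (semi-)variance.

    Only clients whose loss exceeds the mean (the worst-off ones)
    contribute to the penalty; clients below the mean contribute zero.
    The squared deviation is an assumed choice for this sketch.
    """
    mean = sum(losses) / len(losses)
    semi_variance = sum(max(loss - mean, 0.0) ** 2 for loss in losses) / len(losses)
    return mean + beta * semi_variance


# Hypothetical per-client losses for four clients.
client_losses = [0.2, 0.4, 1.2, 0.6]
vred_value = vred_objective(client_losses, beta=1.0)
semivred_value = semivred_objective(client_losses, beta=1.0)
print(vred_value, semivred_value)
```

With these hypothetical losses the mean is 0.6, so the variance penalty counts the deviations of all four clients and gives roughly 0.74, while the semi-variance penalty counts only the client at 1.2 and gives roughly 0.69. Under this sketch, the two well-performing clients are never pushed back toward the mean by the one-sided term, which matches the motivation stated in the abstract.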
