Semi-Variance Reduction for Fair Federated Learning

Saber Malekmohammadi; Yaoliang Yu

TL;DR

This work tackles fairness in Federated Learning under non-IID client data by introducing two risk-aware algorithms, VRed and Semi-VRed, inspired by Mean-Variance and Mean-Semi-Variance in finance. VRed adds a variance penalty to the global objective, increasing emphasis on higher-loss clients, while Semi-VRed uses a one-sided semi-variance term to target downside risk without overly suppressing well-performing clients. The authors provide gradient-based aggregation rules, interpretative insights, and bounds linking to distributionally robust optimization, and validate the methods on vision and language benchmarks with heterogeneous data, showing Semi-VRed achieves state-of-the-art fairness and, in several settings, improved overall mean accuracy. This approach offers a principled, risk-driven mechanism to improve fairness in FL without sacrificing system performance, with practical implications for deploying fair and robust FL systems in real-world, heterogeneous environments.

Abstract

Ensuring fairness in a Federated Learning (FL) system, i.e., satisfactory performance for all of the diverse participating clients, is an important and challenging problem. Multiple fair FL algorithms exist in the literature, and they have been relatively successful in providing fairness. However, these algorithms mostly emphasize the loss functions of worst-off clients to improve their performance, which often suppresses well-performing ones. As a consequence, they usually sacrifice the system's overall average performance to achieve fairness. Motivated by this and inspired by two well-known risk modeling methods in Finance, Mean-Variance and Mean-Semi-Variance, we propose and study two new fair FL algorithms, Variance Reduction (VRed) and Semi-Variance Reduction (Semi-VRed). VRed encourages equality between clients' loss functions by penalizing their variance. In contrast, Semi-VRed penalizes only the discrepancy between the worst-off clients' loss functions and the average loss. Through extensive experiments on multiple vision and language datasets, we show that Semi-VRed achieves state-of-the-art performance in scenarios with heterogeneous data distributions and improves both fairness and the system's overall average performance.
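The contrast between the two penalties can be illustrated with a minimal sketch. It assumes each client reports a scalar loss, that the penalty is weighted by a hypothetical coefficient `beta`, and that the semi-variance is the mean squared excess over the average loss; the function names, the squared form, and the weighting are illustrative assumptions, not the paper's definitions.

```python
import numpy as np

def vred_objective(losses, beta):
    """Mean client loss plus beta times the variance of client losses.

    Assumed form: deviations on both sides of the mean are penalized.
    """
    losses = np.asarray(losses, dtype=float)
    mean = losses.mean()
    return mean + beta * np.mean((losses - mean) ** 2)

def semivred_objective(losses, beta):
    """Mean client loss plus beta times a one-sided semi-variance.

    Assumed form: only clients whose loss exceeds the mean contribute,
    so well-performing clients are not pushed back toward the average.
    """
    losses = np.asarray(losses, dtype=float)
    mean = losses.mean()
    excess = np.maximum(losses - mean, 0.0)  # zero for below-average losses
    return mean + beta * np.mean(excess ** 2)

if __name__ == "__main__":
    client_losses = [0.2, 0.4, 0.6, 1.2]  # hypothetical per-client losses
    print("VRed objective:     ", vred_objective(client_losses, beta=1.0))
    print("Semi-VRed objective:", semivred_objective(client_losses, beta=1.0))
```

In this toy setting the two clients with below-average loss add to the variance penalty but contribute nothing to the semi-variance penalty, which is the one-sided behavior the abstract describes.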

Paper Structure (24 sections, 5 theorems, 34 equations, 2 figures, 6 tables, 1 algorithm)

Introduction
Background
Algorithms based on generalized mean
Algorithms based on enforcing equality
Risk modeling methods in Finance: Mean-Variance and Mean-Semi-Variance
MV and MSV models for fair FL
The VRed algorithm
An interpretation of VRed
The Semi-VRed algorithm
Can we interpret what Semi-VRed does?
Optimization aspect
Scenarios with large label shifts
Experiments
Experimental setup
Comparison of VRed and Semi-VRed with other baseline algorithms
...and 9 more sections

Key Result

Lemma 1

Assuming equal dataset sizes for all clients, for any model parameter $\boldsymbol{\theta}$, the gradient of the global objective $F_{\texttt{VRed}}(\boldsymbol{\theta})$ defined in the VRed objective (eq:v-red) can be expressed as
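The closed-form expression is cut off in this summary. As a hedged sketch only, suppose the objective has a mean-plus-variance form over $n$ equally weighted clients with losses $f_i$ and a penalty coefficient $\beta$; the symbols $n$, $f_i$, $\bar f$, and $\beta$ are notational assumptions, not taken from the source. Under that assumption the gradient follows by direct differentiation:

```latex
\begin{align*}
F_{\texttt{VRed}}(\boldsymbol{\theta})
  &= \bar f(\boldsymbol{\theta})
   + \frac{\beta}{n}\sum_{i=1}^{n}\bigl(f_i(\boldsymbol{\theta}) - \bar f(\boldsymbol{\theta})\bigr)^2,
  \qquad
  \bar f(\boldsymbol{\theta}) = \frac{1}{n}\sum_{i=1}^{n} f_i(\boldsymbol{\theta}), \\
\nabla F_{\texttt{VRed}}(\boldsymbol{\theta})
  &= \nabla \bar f(\boldsymbol{\theta})
   + \frac{2\beta}{n}\sum_{i=1}^{n}\bigl(f_i(\boldsymbol{\theta}) - \bar f(\boldsymbol{\theta})\bigr)
     \bigl(\nabla f_i(\boldsymbol{\theta}) - \nabla \bar f(\boldsymbol{\theta})\bigr) \\
  &= \frac{1}{n}\sum_{i=1}^{n}
     \Bigl[1 + 2\beta\bigl(f_i(\boldsymbol{\theta}) - \bar f(\boldsymbol{\theta})\bigr)\Bigr]
     \nabla f_i(\boldsymbol{\theta}),
\end{align*}
% The last step uses \sum_i (f_i - \bar f) = 0, so the \nabla \bar f term inside the sum vanishes.
```

Under this assumed form, a client with above-average loss receives an aggregation weight larger than $1/n$, which is consistent with the summary's statement that VRed increases emphasis on higher-loss clients.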

Figures (2)

Figure 1: Average and worst 10% test accuracies. Top left: CIFAR-10, top right: CIFAR-100, bottom left: CINIC-10, bottom right: StackOverflow. Due to divergence on highly heterogeneous data, results for AFL on CIFAR-10 and StackOverflow are not shown. All subfigures share the same legends and axis labels.
Figure 2: Worst 20% test accuracies for different algorithms. Top left: CIFAR-10, top right: CIFAR-100, bottom left: CINIC-10, bottom right: StackOverflow. Due to divergence, results for AFL on CIFAR-10 and StackOverflow are not shown. All subfigures share the same legends and axis labels.

Theorems & Definitions (10)

Lemma 1
Lemma 2
Remark 1
Lemma 3
proof
Example 1
Lemma 3
proof
Lemma 3
proof
