
Loss Gap Parity for Fairness in Heterogeneous Federated Learning

Brahim Erraji, Michaël Perrot, Aurélien Bellet

Abstract

While clients may join federated learning to improve performance on data they rarely observe locally, they often remain self-interested, expecting the global model to perform well on their own data. This motivates an objective that ensures all clients achieve a similar loss gap, that is, the difference in performance between the global model and the best model they could train using only their local data. To this end, we propose EAGLE, a novel federated learning algorithm that explicitly regularizes the global model to minimize disparities in loss gaps across clients. Our approach is particularly effective in heterogeneous settings, where the optimal local models of the clients may be misaligned. Unlike existing methods that encourage loss parity, potentially degrading performance for many clients, EAGLE targets fairness in relative improvements. We provide theoretical convergence guarantees for EAGLE under non-convex loss functions, and characterize how its iterates perform relative to the standard federated learning objective using a novel heterogeneity measure. Empirically, we demonstrate that EAGLE reduces the disparity in loss gaps among clients by prioritizing those furthest from their local optimal loss, while maintaining competitive utility in both convex and non-convex cases compared to strong baselines.
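To make the loss gap notion concrete, here is a minimal Python sketch (not from the paper) of how per-client loss gaps and their disparity could be measured; the helper name `loss_gaps` and all numbers are hypothetical.

```python
import numpy as np

def loss_gaps(global_losses, local_optimal_losses):
    """Per-client loss gap: loss of the global model on client k's data,
    F_k(theta), minus the best loss a purely local model could achieve, F_k^*."""
    return np.asarray(global_losses, dtype=float) - np.asarray(local_optimal_losses, dtype=float)

# Hypothetical values for three clients (not taken from the paper).
global_losses = [0.42, 0.35, 0.50]   # F_k(theta) under the shared global model
local_optimal = [0.10, 0.30, 0.20]   # F_k^*, each client's best local loss

gaps = loss_gaps(global_losses, local_optimal)
print("loss gaps:", gaps)                                 # [0.32 0.05 0.3 ]
print("disparity (max - min):", gaps.max() - gaps.min())  # 0.27
```

Loss parity would compare the raw values $F_k(\theta)$ directly; loss gap parity, as targeted by EAGLE, instead asks that the entries of `gaps` be close to each other.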

Paper Structure

This paper contains 25 sections, 8 theorems, 38 equations, 8 figures, 11 tables, 1 algorithm.

Key Result

Theorem 1

Let $\theta_k^{(t, \tau)}$ denote the model of client $k$ after $t$ communication rounds and $\tau$ local steps, and let $\bar{\theta}^{(t, \tau)} := \frac{1}{K} \sum_{k=1}^K \theta_k^{(t, \tau)}$. Let $T$ be the total number of communication rounds and $I$ the number of local steps between consecutive rounds, with $F^{*} := \min\limits_{\theta \in \mathcal{H}} F(\theta)$ and $\xi_1 := 2 B^2 (1 + 4\lambda \,\ldots$
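The theorem refers to a regularized objective that is only partially recoverable from this extract. As a purely illustrative sketch, assuming the form suggested by the abstract (EAGLE regularizes the global model toward equal loss gaps, with $\lambda$ the regularization strength and $F_k^*$ client $k$'s best achievable local loss), such an objective could be written as

$$\min_{\theta \in \mathcal{H}} \; F(\theta) + \lambda \sum_{k=1}^{K} \left( \bigl(F_k(\theta) - F_k^*\bigr) - \frac{1}{K} \sum_{j=1}^{K} \bigl(F_j(\theta) - F_j^*\bigr) \right)^2, \qquad F(\theta) := \frac{1}{K} \sum_{k=1}^{K} F_k(\theta),$$

where the penalty vanishes exactly when all clients share the same loss gap. The paper's actual formulation may differ.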

Figures (8)

  • Figure 1: In this example, we use the AFL algorithm to enforce loss parity between the two clients, and our approach, EAGLE, to enforce loss gap parity. Enforcing loss parity favors $\text{client}~1$, which has noisier data and a more complex prediction task, resulting in poorer performance for $\text{client}~0$. In contrast, optimizing for loss gap parity handles this imbalance more effectively.
  • Figure 3: Local losses and local gaps for EAGLE($\lambda = 2.0$), q-FFL ($q = 5.0$) and AFL. The optimal local loss of $\text{client}~0$ is nearly $0$ by design and thus only its loss gap is visible in the plot. By balancing the losses across clients, q-FFL and AFL prioritize learning a good classifier for clients 1 and 2, which harms the performance of $\text{client}~0$. In contrast, by balancing the loss gaps across clients, EAGLE allows all clients to benefit equally from federated learning.
  • Figure 4: The distribution of loss gaps for the $10$ clients for a heterogeneous split ($\alpha = 0.5$) with a CNN model. We observe that, with highly non-IID data, EAGLE reduces the loss gap of the worst-performing client and achieves lower variance than the other baselines.
  • Figure 5: Synthetic data distributions of the three clients. Rotating the data results in $\text{client}~1$ and $\text{client}~2$ sharing the same optimal separator, which differs from that of $\text{client}~0$. This difference results in conflicting gradient directions between the local optimizers of different clients (see the sketch after this list).
  • Figure 6: Training behavior on the synthetic dataset. AFL and q-FFL increase the training loss of $\text{client}~0$ to match that of $\text{client}~2$, but fail to balance the losses of $\text{client}~1$ and $\text{client}~2$, which share the same optimal model despite differing in data separability.
  • ...and 3 more figures
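As referenced in the Figure 5 caption, the following is a minimal, hypothetical sketch of the kind of rotated synthetic split it describes; the base distribution, rotation angle, and noise level are assumptions, not the paper's exact data generator.

```python
import numpy as np

rng = np.random.default_rng(0)

def rotate(X, degrees):
    """Rotate 2-D points counter-clockwise by the given angle."""
    t = np.deg2rad(degrees)
    R = np.array([[np.cos(t), -np.sin(t)],
                  [np.sin(t),  np.cos(t)]])
    return X @ R.T

# Base two-class data whose optimal separator is the vertical axis.
X = rng.normal(size=(300, 2))
y = (X[:, 0] > 0).astype(int)

client0 = (X, y)              # original orientation: vertical separator
client1 = (rotate(X, 90), y)  # rotated: optimal separator becomes horizontal
client2 = (rotate(X, 90) + rng.normal(scale=0.5, size=X.shape), y)
# client 2 shares client 1's optimal separator but has noisier, less
# separable data, while both conflict with client 0's optimal model.
```

Under loss parity, the noisier client would drag the others toward its higher loss; under loss gap parity, each client is only compared against what it could achieve locally.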

Theorems & Definitions (15)

  • Definition 1: $\varepsilon$-Loss Parity
  • Definition 2: $\varepsilon$-Loss Gap Parity (both notions are sketched after this list)
  • Theorem 1: Convergence to a solution of the regularized objective
  • Theorem 2: Convergence to a neighborhood of the standard federated objective
  • Lemma 1: Properties of the weights of clients $w_k(\theta)$
  • Proof
  • Lemma 2: Bounded norm of the gradient of $\tilde{F}_k$
  • Proof
  • Lemma 3
  • Proof
  • ...and 5 more
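Definitions 1 and 2 above are the paper's two fairness notions. The following side-by-side statement is a hedged reconstruction from the abstract's description, not the paper's verbatim definitions: loss parity asks that raw losses be close across clients, while loss gap parity asks that improvements relative to each client's best local model be close.

$$\text{($\varepsilon$-Loss Parity):} \quad \bigl|F_k(\theta) - F_j(\theta)\bigr| \le \varepsilon \quad \text{for all clients } k, j,$$

$$\text{($\varepsilon$-Loss Gap Parity):} \quad \bigl|\bigl(F_k(\theta) - F_k^*\bigr) - \bigl(F_j(\theta) - F_j^*\bigr)\bigr| \le \varepsilon \quad \text{for all clients } k, j,$$

where $F_k$ is client $k$'s local loss and $F_k^* := \min_{\theta' \in \mathcal{H}} F_k(\theta')$ its best achievable local loss. Under this reading, a client with an intrinsically harder task (larger $F_k^*$) is not forced down to the raw loss of easier clients, which is the imbalance Figure 1 illustrates.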