pFedFair: Towards Optimal Group Fairness-Accuracy Trade-off in Heterogeneous Federated Learning

Haoyu Lei, Shizhan Gong, Qi Dou, Farzan Farnia

TL;DR

The paper addresses group fairness in federated learning under client heterogeneity, where a single globally fair classifier can be suboptimal for individual clients. It introduces pFedFair, a personalized FL framework that keeps a shared model optimized for utility while letting each client impose its own fairness constraints through a Moreau-envelope-based formulation. The authors show theoretically that enforcing fairness purely at the global level is suboptimal under heterogeneous client distributions, and show empirically that pFedFair yields better client-level fairness-accuracy trade-offs on tabular tasks (Adult, COMPAS) and vision tasks (CelebA, UTKFace), using DINOv2 and CLIP embeddings to extend the method to computer vision. These contributions offer a practical, scalable approach to deploying fair FL in real-world, non-IID settings and motivate extensions to more complex vision and multimodal tasks.
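
As a rough illustration of the Moreau-envelope coupling mentioned above, the sketch below computes a client's personalized parameters as the minimizer of a local loss plus a proximity term that pulls them toward the shared model. The quadratic local loss, the names `personalize` and `grad_local`, and the values of `lam`, `lr`, and `steps` are assumptions made for illustration; this is the generic Moreau-envelope construction, not the paper's exact objective, in which the local loss would presumably also carry the client's fairness term.

```python
import numpy as np

def personalize(w_global, grad_local, lam=1.0, lr=0.1, steps=500):
    """Approximately solve
        min_theta  f_i(theta) + (lam / 2) * ||theta - w_global||^2
    by plain gradient descent.

    grad_local(theta) returns the gradient of a (hypothetical) local loss f_i.
    lam controls how strongly the personalized model is tied to the global one.
    """
    theta = w_global.copy()
    for _ in range(steps):
        # Gradient of the local loss plus gradient of the proximity term.
        g = grad_local(theta) + lam * (theta - w_global)
        theta = theta - lr * g
    return theta

# Toy local loss f_i(theta) = 0.5 * ||theta - a||^2, chosen only because
# its proximal minimizer has a closed form to compare against.
a = np.array([2.0, -1.0])   # hypothetical local optimum of the client
w = np.array([0.0, 0.0])    # hypothetical shared (global) model
lam = 1.0

theta_i = personalize(w, lambda th: th - a, lam=lam)
closed_form = (a + lam * w) / (1.0 + lam)
```

With a larger `lam` the personalized parameters stay closer to the shared model; with `lam` near zero each client effectively trains on its own loss alone.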

Abstract

Federated learning (FL) algorithms commonly aim to maximize clients' accuracy by training a model on their collective data. However, in several FL applications, the model's decisions should meet a group fairness constraint to be independent of sensitive attributes such as gender or race. While such group fairness constraints can be incorporated into the objective function of the FL optimization problem, in this work, we show that such an approach would lead to suboptimal classification accuracy in an FL setting with heterogeneous client distributions. To achieve an optimal accuracy-group fairness trade-off, we propose the Personalized Federated Learning for Client-Level Group Fairness (pFedFair) framework, where clients locally impose their fairness constraints over the distributed training process. Leveraging image embedding models, we extend the application of pFedFair to computer vision settings, where we numerically show that pFedFair achieves an optimal group fairness-accuracy trade-off in heterogeneous FL settings. We present the results of several numerical experiments on benchmark and synthetic datasets, which highlight the suboptimality of non-personalized FL algorithms and the improvements made by the pFedFair method.

Paper Structure

This paper contains 19 sections, 2 theorems, 15 equations, 7 figures, 3 tables, 1 algorithm.

Key Result

Proposition 4.1

Consider $m$ clients in an FL task with $0/1$-loss, where $P^{(i)}(X,S,Y)$ represents the joint distribution of Client $i\in\{1,\ldots ,m\}$ and binary $S\in\{0,1\}$. Suppose the conditional distribution $P^{(i)}(Y|X,S)$ is shared across clients, where $Y= g(X,S)$ for a labeling scheme $g$.

Figures (7)

  • Figure 1: Illustration of client-level fairness-aware personalization in heterogeneous federated learning. The fairness-aware global model aggregation often favors majority sensitive groups due to client heterogeneity. Our proposed pFedFair enables each client to learn a personalized fair classifier, optimizing the client-level fairness-accuracy trade-off while using global model knowledge.
  • Figure 2: Overview of pFedFair framework in heterogeneous federated learning settings. The framework integrates a global model optimized for utility with fairness-aware personalization at each client to achieve client-level optimal fairness-accuracy trade-offs.
  • Figure 3: Experimental results comparing Negative Prediction Rates (NPR) of different clients for Fairness-aware FedAvg (Left) and pFedFair (Right) algorithms when applied to the four benchmarks. The fairness coefficient $\eta=0$ means an ERM setting with no fairness constraint, while $\eta=0.9$ is the strongest fairness regularization coefficient over the range $[0,1)$.
  • Figure 4: Experimental Results. (a) Test Error ($\downarrow$) vs. DDP ($\downarrow$) trade-off in centralized settings, compared to other fairness-aware visual recognition frameworks. (b) Test Error ($\downarrow$) vs. DDP ($\downarrow$) trade-off in federated learning settings, compared to other fairness-aware FL frameworks. (c) Effect of parameter $\lambda$ in balancing global model updates and local fairness-aware personalization for optimal trade-offs.
  • Figure 5: Experimental results of different baseline methods on Adult (top) and COMPAS (bottom). Each data point represents the client-level performance on Accuracy and DDP.
  • ...and 2 more figures
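
The captions above report per-group Negative Prediction Rates (NPR) and a DDP score. The sketch below shows one way such quantities could be computed for a binary classifier and a binary sensitive attribute. It assumes that DDP is the absolute gap in positive-prediction rates between the two sensitive groups and that NPR is the fraction of negative predictions within a group; the paper's exact definitions are not given in this excerpt, and the function names and toy arrays are hypothetical.

```python
import numpy as np

def positive_rate(y_pred, s, group):
    """Fraction of samples in the given sensitive group predicted as 1."""
    mask = (s == group)
    return float(np.mean(y_pred[mask] == 1))

def ddp(y_pred, s):
    """Assumed definition: |P(y_hat = 1 | S = 0) - P(y_hat = 1 | S = 1)|."""
    return abs(positive_rate(y_pred, s, 0) - positive_rate(y_pred, s, 1))

def npr(y_pred, s, group):
    """Assumed definition: fraction of the group predicted as 0."""
    return 1.0 - positive_rate(y_pred, s, group)

# Toy predictions for one client: four samples per sensitive group.
y_pred = np.array([1, 0, 1, 1, 0, 0, 1, 0])
s      = np.array([0, 0, 0, 0, 1, 1, 1, 1])

client_ddp = ddp(y_pred, s)      # gap between 3/4 and 1/4
npr_group0 = npr(y_pred, s, 0)   # one of four predicted negative
npr_group1 = npr(y_pred, s, 1)   # three of four predicted negative
```

Under these assumed definitions, a DDP of zero means both groups receive positive predictions at the same rate, which is equivalent to their NPR values being equal.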

Theorems & Definitions (6)

  • Proposition 4.1
  • proof
  • Remark 4.2
  • Proposition 4.3
  • proof
  • Remark 4.4