Addressing Data Quality Decompensation in Federated Learning via Dynamic Client Selection

Qinjun Fei, Nuria Rodríguez-Barroso, María Victoria Luzón, Zhongliang Zhang, Francisco Herrera

TL;DR

This paper addresses data quality decompensation in cross-silo Federated Learning by introducing SBRO-FL, a unified framework that combines dynamic bidding, prospect-theory–driven reputation, and budget-aware client selection. It leverages Shapley-value–based contribution evaluation to quantify each client's marginal impact ($sv_i^t$) and updates reputations through a risk-sensitive mechanism, all within a 0-1 integer programming formulation for selection. The approach jointly optimizes data reliability, incentive compatibility, and cost efficiency, demonstrating improved accuracy, faster convergence, and robustness against adversarial bidding and noisy data across four datasets. The work has practical implications for scalable, trustworthy FL deployments by balancing data quality, economic feasibility, and participation incentives.
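As an illustration of what a Shapley-value contribution score such as $sv_i^t$ measures, the following is a minimal sketch that computes exact Shapley values for a handful of clients. The names `shapley_values`, `toy_utility`, and `GAINS` are hypothetical, and the additive toy utility stands in for whatever model-performance measure the paper actually uses; exact enumeration is exponential in the number of clients, so it only suits very small examples.

```python
from itertools import combinations
from math import factorial


def shapley_values(clients, utility):
    """Exact Shapley value of each client for a coalition utility function.

    `utility` maps a frozenset of clients to a number (for example, the
    validation accuracy of a model aggregated from that coalition).
    """
    n = len(clients)
    values = {c: 0.0 for c in clients}
    for c in clients:
        others = [x for x in clients if x != c]
        for k in range(n):
            # Weight of coalitions of size k that do not contain c.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                # Marginal impact of adding c to coalition s.
                values[c] += weight * (utility(s | {c}) - utility(s))
    return values


# Hypothetical per-client gains; client "c" holds noisy data and hurts the model.
GAINS = {"a": 0.30, "b": 0.20, "c": -0.05}


def toy_utility(coalition):
    # Assumption: an additive utility, chosen only to keep the example checkable.
    return sum(GAINS[c] for c in coalition)


sv = shapley_values(list(GAINS), toy_utility)
```

With an additive utility, each client's Shapley value equals its own gain, so the noisy client receives a negative score. In a real FL round the utility would not be additive, and the scores would reflect interactions between clients.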

Abstract

In cross-silo Federated Learning (FL), client selection is critical to ensuring high model performance, yet it remains challenging due to data quality decompensation, budget constraints, and incentive compatibility. As training progresses, these factors exacerbate client heterogeneity and degrade global performance. Most existing approaches treat these challenges in isolation, which makes it difficult to optimize them jointly. To address this, we propose Shapley-Bid Reputation Optimized Federated Learning (SBRO-FL), a unified framework integrating dynamic bidding, reputation modeling, and cost-aware selection. Clients submit bids based on their perceived data quality, and their contributions are evaluated using Shapley values to quantify their marginal impact on the global model. A reputation system, inspired by prospect theory, captures historical performance while penalizing inconsistency. The client selection problem is formulated as a 0-1 integer program that maximizes reputation-weighted utility under budget constraints. Experiments on the FashionMNIST, EMNIST, CIFAR-10, and SVHN datasets show that SBRO-FL improves accuracy, convergence speed, and robustness, even in adversarial and low-bid interference scenarios. Our results highlight the importance of balancing data reliability, incentive compatibility, and cost efficiency to enable scalable and trustworthy FL deployments.
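The budget-constrained selection step can be pictured as a 0-1 knapsack problem. The sketch below assumes the simplest possible objective, the sum of the selected clients' reputation scores subject to the total of their bids staying within the budget; the exact objective and constraints in the paper may differ. The function name `select_clients` and the numbers are hypothetical, and brute-force enumeration is used only because it is easy to verify on a few clients.

```python
from itertools import product


def select_clients(reputations, bids, budget):
    """Solve a tiny 0-1 selection problem by enumerating every assignment.

    Maximizes sum(reputations[i] * x[i]) subject to
    sum(bids[i] * x[i]) <= budget with x[i] in {0, 1}.
    """
    n = len(reputations)
    best_value, best_choice = 0.0, (0,) * n
    for choice in product((0, 1), repeat=n):
        cost = sum(b * x for b, x in zip(bids, choice))
        if cost > budget:
            continue  # Infeasible: the bids exceed the budget.
        value = sum(r * x for r, x in zip(reputations, choice))
        if value > best_value:
            best_value, best_choice = value, choice
    return best_choice, best_value


# Hypothetical reputations and bids for four clients, with a budget of 7.
choice, value = select_clients(
    reputations=[0.9, 0.6, 0.5, 0.2],
    bids=[5, 3, 3, 1],
    budget=7,
)
```

Here the most reputable client is not selected: its bid of 5 crowds out cheaper clients whose combined reputation is higher. For realistic client counts, a dynamic-programming knapsack or an integer-programming solver would replace the enumeration.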

Paper Structure

This paper contains 22 sections, 9 equations, 5 figures, 3 tables, 1 algorithm.

Figures (5)

  • Figure 1: Federated learning example: client–server architecture.
  • Figure 2: Workflow of SBRO-FL within the traditional FL framework.
  • Figure 3: Prospect theory value function. The X-axis represents gains and losses relative to a reference point, while the Y-axis represents perceived value. The function is asymmetric: losses have a steeper curve than gains, reflecting loss aversion, meaning individuals perceive losses more strongly than equivalent gains. Conversely, gains exhibit diminishing sensitivity, meaning the perceived impact of additional gains decreases as they accumulate.
  • Figure 4: Trends in Model Accuracy: Comparing SBRO-FL and Baseline Methods Across Diverse Datasets.
  • Figure 5: Trends in Model Accuracy: Evaluating SBRO-FL and Baseline Methods Under Low-Cost Interference Across Datasets.
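The shape described in the Figure 3 caption can be reproduced with the standard power-form value function from the prospect theory literature. This is a sketch under assumptions: the function name `prospect_value` is hypothetical, and the default parameters (0.88 for both curvature exponents and 2.25 for the loss-aversion coefficient) are the commonly cited estimates from Tversky and Kahneman, not values confirmed for SBRO-FL.

```python
def prospect_value(x, alpha=0.88, beta=0.88, lam=2.25):
    """Perceived value of an outcome x measured relative to a reference point.

    Gains (x >= 0) follow x**alpha, which flattens as gains accumulate.
    Losses (x < 0) follow -lam * (-x)**beta, which is steeper than the
    gain branch whenever lam > 1 (loss aversion).
    """
    if x >= 0:
        return x ** alpha
    return -lam * ((-x) ** beta)


gain = prospect_value(1.0)    # perceived value of a unit gain
loss = prospect_value(-1.0)   # perceived value of a unit loss
```

With these parameters a unit loss is perceived as 2.25 times as large as a unit gain, which is the asymmetry a reputation update can use to penalize a poor round more heavily than it rewards a good one.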

Theorems & Definitions (4)

  • Definition 1
  • Definition 2
  • Definition 3
  • Definition 4