Optimizing Federated Learning by Entropy-Based Client Selection

Andreas Lutz, Gabriele Steidl, Karsten Müller, Wojciech Samek

TL;DR

FedEntOpt addresses label skew in federated learning by greedily selecting clients to maximize the entropy of the aggregated label distribution, effectively approximating the global marginal $P_Y(y) = \sum_{s\in S} \frac{n_s}{\sum_j n_j} P_{Y^{(s)}}(y)$. The server collects per-client label-count vectors $l^{(k)}$, builds a running aggregate $L$, and adds the next client $m$ by maximizing $H\big((L + l^{(m)}) / \lVert L + l^{(m)} \rVert_1\big)$, using a FIFO buffer to maintain diversity. Across CIFAR-style and medical datasets, FedEntOpt yields gains of up to about 6% in standard settings and over 30% in low-participation scenarios, remains robust under differential privacy with $\epsilon = 0.5$, and provides additive improvements when combined with other baselines. The approach offers a lightweight, privacy-preserving enhancement to FL with practical applicability to real-world deployments.
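The greedy step summarized above can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes the Shannon entropy $H(p) = -\sum_y p_y \log p_y$ with the natural logarithm, starts from an empty aggregate, and breaks ties by lowest client index. The names `entropy`, `select_clients`, and `excluded`, and the way the FIFO buffer is used to skip recently chosen clients, are hypothetical choices made for illustration only.

```python
from collections import deque

import numpy as np


def entropy(counts):
    """Shannon entropy (in nats) of a label-count vector after L1 normalisation."""
    counts = np.asarray(counts, dtype=float)
    total = counts.sum()
    if total == 0:
        return 0.0
    p = counts[counts > 0] / total  # drop empty classes, since 0 * log 0 := 0
    return float(-(p * np.log(p)).sum())


def select_clients(label_counts, num_select, excluded=frozenset()):
    """Greedily pick clients so the aggregated label distribution has high entropy.

    label_counts: array of shape (num_clients, num_classes), one count vector per client.
    excluded:     client ids to skip (assumption: ids currently held in the buffer).
    Returns the chosen client ids and the aggregated count vector L.
    """
    label_counts = np.asarray(label_counts, dtype=float)
    candidates = [k for k in range(len(label_counts)) if k not in excluded]
    aggregate = np.zeros(label_counts.shape[1])
    selected = []
    while candidates and len(selected) < num_select:
        # Choose the client m that maximises H((L + l_m) / ||L + l_m||_1).
        best = max(candidates, key=lambda k: entropy(aggregate + label_counts[k]))
        selected.append(best)
        candidates.remove(best)
        aggregate += label_counts[best]
    return selected, aggregate


# Hypothetical example: 4 clients, 3 classes, each client holds a single class.
counts = np.array([[10, 0, 0], [0, 10, 0], [0, 0, 10], [10, 0, 0]])
recent = deque(maxlen=2)  # assumed FIFO buffer of recently chosen client ids
for round_idx in range(2):
    chosen, agg = select_clients(counts, num_select=2, excluded=set(recent))
    recent.extend(chosen)
    print(round_idx, chosen, round(entropy(agg), 3))
```

Because only the count vectors $l^{(k)}$ enter the computation, the server never sees raw samples. In this sketch, excluding buffered clients forces different clients to be chosen in the next round, which is one plausible reading of how the buffer maintains diversity.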

Abstract

Although deep learning has revolutionized domains such as natural language processing and computer vision, its dependence on centralized datasets raises serious privacy concerns. Federated learning addresses this issue by enabling multiple clients to collaboratively train a global deep learning model without compromising their data privacy. However, the performance of such a model degrades under label skew, where the label distribution differs between clients. To overcome this issue, a novel method called FedEntOpt is proposed. In each round, it selects clients to maximize the entropy of the aggregated label distribution, ensuring that the global model is exposed to data from all available classes. Extensive experiments on multiple benchmark datasets show that the proposed method outperforms several state-of-the-art algorithms by up to 6% in classification accuracy under standard settings regardless of the model size, while achieving gains of over 30% in scenarios with low participation rates and client dropout. In addition, FedEntOpt offers the flexibility to be combined with existing algorithms, enhancing their classification accuracy by more than 40%. Importantly, its performance remains unaffected even when differential privacy is applied.

Paper Structure

This paper contains 26 sections, 13 equations, 2 figures, 6 tables, 1 algorithm.

Figures (2)

  • Figure 1: Entropy of the combined label distribution over selected client subsets in each communication round.
  • Figure 2: Mean test accuracy (%) over 10 final rounds of varying participation rates for $\mathrm{Dir}(0.1)$ and $C=2$ on CIFAR-10 and CIFAR-100, comparing baselines with FedEntOpt. Results for VGG-11 are shown above and for LeNet-5 below.

Theorems & Definitions (1)

  • Definition 1