Benchmarking Mutual Information-based Loss Functions in Federated Learning

Sarang S, Harsh D. Chothani, Qilei Li, Ahmed M. Abdelmoniem, Arnab K. Paul

TL;DR

Federated Learning faces fairness and performance gaps due to data heterogeneity and privacy constraints. The authors benchmark Mutual Information (MI)-based losses, rooted in variational MI bounds such as DV and NWJ, against Cross-Entropy baselines across FL strategies (FedAvg, Ditto, q-FedAvg) using CIFAR-10. They demonstrate that MI-based losses can reduce disparities among clients while preserving or enhancing overall performance, with losses like $I_{\text{ReJS}}$ and $I_{\text{ReInfoNCE}}$ showing strong gains, especially in Non-IID settings. Although MI-based losses incur computational overhead, the study provides a scalable, fairness-focused approach for FL and outlines directions for efficiency improvements and hybrid priors.
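For orientation, the two variational bounds named above are usually written as follows. This is a sketch in standard textbook notation, not taken from the paper: the critic function $T(x, y)$ and the symbols $p(x, y)$, $p(x)$, $p(y)$ are assumptions introduced here for illustration.

$$
I_{\text{DV}}(X; Y) \;\ge\; \mathbb{E}_{p(x, y)}\big[T(x, y)\big] \;-\; \log \mathbb{E}_{p(x)\,p(y)}\big[e^{T(x, y)}\big]
$$

$$
I_{\text{NWJ}}(X; Y) \;\ge\; \mathbb{E}_{p(x, y)}\big[T(x, y)\big] \;-\; e^{-1}\, \mathbb{E}_{p(x)\,p(y)}\big[e^{T(x, y)}\big]
$$

In both cases, maximizing the right-hand side over $T$ tightens the lower bound on the mutual information, which is what lets such bounds serve as trainable loss terms.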

Abstract

Federated Learning (FL) has attracted considerable interest due to growing privacy concerns and regulations like the General Data Protection Regulation (GDPR), which stresses the importance of privacy-preserving and fair machine learning approaches. In FL, model training takes place on decentralized data, allowing clients to upload a locally trained model and receive a globally aggregated model without exposing sensitive information. However, challenges related to fairness, such as biases, uneven performance among clients, and the "free rider" issue, complicate its adoption. In this paper, we examine the use of Mutual Information (MI)-based loss functions to address these concerns. MI has proven to be a powerful method for measuring dependencies between variables and optimizing deep learning models. By leveraging MI to extract essential features and minimize biases, we aim to improve both the fairness and effectiveness of FL systems. Through extensive benchmarking, we assess the impact of MI-based losses in reducing disparities among clients while enhancing the overall performance of FL.

Paper Structure

This paper contains 14 sections, 10 equations, 4 figures, 1 table.

Figures (4)

  • Figure 1: Cross-Device FL in IID & Non-IID settings.
  • Figure 2: Cross-Silo FL in IID and Non-IID settings.
  • Figure 3: Comparison between Ditto cross-device IID experiments for CE and $I_\text{ReJS}$.
  • Figure 4: Comparison between FedAvg cross-device Non-IID experiments for CE and $I_\text{ReInfoNCE}$.