Benchmarking Mutual Information-based Loss Functions in Federated Learning
Sarang S, Harsh D. Chothani, Qilei Li, Ahmed M. Abdelmoniem, Arnab K. Paul
TL;DR
Federated Learning (FL) faces fairness and performance gaps due to data heterogeneity and privacy constraints. The authors benchmark Mutual Information (MI)-based losses, rooted in variational MI bounds such as Donsker-Varadhan (DV) and Nguyen-Wainwright-Jordan (NWJ), against Cross-Entropy baselines across three FL strategies (FedAvg, Ditto, q-FedAvg) on CIFAR-10. They show that MI-based losses can reduce disparities among clients while preserving or improving overall performance, with losses such as $I_{\text{ReJS}}$ and $I_{\text{ReInfoNCE}}$ showing strong gains, especially in Non-IID settings. Although MI-based losses incur computational overhead, the study provides a scalable, fairness-focused approach for FL and outlines directions for efficiency improvements and hybrid priors.
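For reference, the DV and NWJ bounds named above are commonly written as follows. The critic function $T$ and the notation here follow the general MI-estimation literature and are an assumption, not this paper's own formulation:

$$I(X;Y) \;\ge\; \sup_{T}\; \mathbb{E}_{p(x,y)}\big[T(x,y)\big] - \log \mathbb{E}_{p(x)p(y)}\big[e^{T(x,y)}\big] \quad \text{(DV)}$$

$$I(X;Y) \;\ge\; \sup_{T}\; \mathbb{E}_{p(x,y)}\big[T(x,y)\big] - \mathbb{E}_{p(x)p(y)}\big[e^{T(x,y)-1}\big] \quad \text{(NWJ)}$$

In both cases the first expectation is over paired samples and the second over samples drawn from the product of marginals, so maximizing the right-hand side over $T$ tightens a lower bound on $I(X;Y)$.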
Abstract
Federated Learning (FL) has attracted considerable interest due to growing privacy concerns and regulations like the General Data Protection Regulation (GDPR), which stresses the importance of privacy-preserving and fair machine learning approaches. In FL, model training takes place on decentralized data: each client uploads a locally trained model and receives a globally aggregated model without exposing sensitive information. However, fairness challenges, such as biases, uneven performance among clients, and the "free rider" issue, complicate its adoption. In this paper, we examine the use of Mutual Information (MI)-based loss functions to address these concerns. MI has proven to be a powerful tool for measuring dependencies between variables and for optimizing deep learning models. By leveraging MI to extract essential features and minimize biases, we aim to improve both the fairness and effectiveness of FL systems. Through extensive benchmarking, we assess the impact of MI-based losses in reducing disparities among clients while enhancing the overall performance of FL.
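The upload-and-aggregate round described above can be sketched as a sample-weighted average of client parameters, in the style of FedAvg. This is a minimal illustration rather than the paper's implementation: the function name `fedavg_aggregate`, the toy client arrays, and the sample counts are all hypothetical.

```python
import numpy as np

def fedavg_aggregate(client_weights, client_sizes):
    """Average per-layer parameters across clients, weighted by local sample count.

    client_weights: list (one entry per client) of lists of np.ndarray layers.
    client_sizes:   list of local training-set sizes, one per client.
    """
    total = float(sum(client_sizes))
    n_layers = len(client_weights[0])
    return [
        sum(w[i] * (n / total) for w, n in zip(client_weights, client_sizes))
        for i in range(n_layers)
    ]

# Two hypothetical clients, each holding a single 2-element "layer".
clients = [[np.array([1.0, 2.0])], [np.array([3.0, 4.0])]]
sizes = [1, 3]  # the second client has three times as much data
global_weights = fedavg_aggregate(clients, sizes)
```

Only model parameters and sample counts cross the network in this sketch; the raw client data never leaves the client, which is the privacy property the abstract refers to.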
