Unlocking the Potential of Model Calibration in Federated Learning
Yun-Wei Chu, Dong-Jun Han, Seyyedali Hosseinalipour, Christopher Brinton
TL;DR
This work addresses the overlooked issue of probability calibration in federated learning (FL), where models can be miscalibrated under data heterogeneity and client privacy constraints. It introduces Non-Uniform Calibration for Federated Learning (NUCFL), a framework that adds a train-time calibration loss to local FL training and sets a client-specific penalty for that loss based on the similarity between each client's local model and the global model, measured with, for example, cosine similarity or Centered Kernel Alignment (CKA). By tying the calibration penalty to how closely each client aligns with the global model, NUCFL improves calibration, lowering Expected Calibration Error (ECE) and Static Calibration Error (SCE), without sacrificing accuracy. It is compatible with a range of FL algorithms and with calibration losses such as DCA and MDCA. Extensive experiments show that NUCFL yields robust gains across datasets (MNIST, FEMNIST, CIFAR-10/100) and FL strategies, and that it remains effective under different levels of data heterogeneity and different client participation scenarios, which makes it relevant for trustworthy FL systems.
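To make the calibration metric concrete, the following is a minimal sketch of a binned ECE computation, the first of the two metrics named above. It is not taken from the paper: the function name `expected_calibration_error`, the default of 10 equal-width bins, and the half-open binning rule are illustrative assumptions.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Binned ECE: sample-weighted average of |accuracy - mean confidence| per bin.

    confidences: top-class probabilities in [0, 1], one per prediction.
    correct:     booleans, True where the predicted class was right.
    n_bins:      number of equal-width confidence bins (assumed default).
    """
    n = len(confidences)
    if n == 0:
        raise ValueError("need at least one prediction")

    conf_sum = [0.0] * n_bins  # sum of confidences per bin
    hit_sum = [0.0] * n_bins   # number of correct predictions per bin
    count = [0] * n_bins       # number of predictions per bin

    for conf, hit in zip(confidences, correct):
        # Half-open bins [k/n_bins, (k+1)/n_bins); confidence 1.0 goes to the last bin.
        idx = min(int(conf * n_bins), n_bins - 1)
        conf_sum[idx] += conf
        hit_sum[idx] += 1.0 if hit else 0.0
        count[idx] += 1

    # (count/n) * |hit_sum/count - conf_sum/count| simplifies to |hit_sum - conf_sum| / n.
    return sum(abs(h - c) / n for h, c, k in zip(hit_sum, conf_sum, count) if k)
```

A model that reports 0.9 confidence and is right 90% of the time contributes nothing to this sum; over- or under-confident bins contribute in proportion to how many predictions fall in them.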
Abstract
Over the past several years, various federated learning (FL) methodologies have been developed to improve model accuracy, a primary performance metric in machine learning. However, to use FL in practical decision-making scenarios, the trained model must not only be accurate but also report reliable confidence in each of its predictions, an aspect that has been largely overlooked in existing FL research. Motivated by this gap, we propose Non-Uniform Calibration for Federated Learning (NUCFL), a generic framework that integrates FL with the concept of model calibration. The inherent data heterogeneity in FL environments makes model calibration particularly difficult, as the model must remain reliable across diverse data distributions and client conditions. NUCFL addresses this challenge by dynamically adjusting the model calibration objectives based on statistical relationships between each client's local model and the global model in FL. In particular, NUCFL assesses the similarity between each local model and the global model, and uses it to control the penalty term for the calibration loss during client-side local training. By doing so, NUCFL effectively aligns calibration needs for the global model in heterogeneous FL settings while not sacrificing accuracy. Extensive experiments show that NUCFL offers flexibility and effectiveness across various FL algorithms, enhancing accuracy as well as model calibration.
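The similarity-controlled penalty described above can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation. It assumes that models are compared as flattened parameter vectors, that cosine similarity is the similarity measure, that a higher similarity maps to a larger penalty (the text above does not state the direction), and that the calibration term is a DCA-style gap between mean confidence and accuracy. All function names are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def calibration_penalty(local_params, global_params):
    """Client-specific penalty weight in [0, 1].

    Assumption: the weight is the cosine similarity clipped to [0, 1], so a
    client whose model is closer to the global model gets a larger penalty.
    """
    return min(1.0, max(0.0, cosine_similarity(local_params, global_params)))


def dca_style_term(confidences, correct):
    """DCA-style calibration term: |mean confidence - accuracy| over a batch."""
    n = len(confidences)
    mean_conf = sum(confidences) / n
    accuracy = sum(1.0 for hit in correct if hit) / n
    return abs(mean_conf - accuracy)


def local_objective(ce_loss, confidences, correct, local_params, global_params):
    """Local training objective: task loss plus the similarity-weighted calibration term."""
    beta = calibration_penalty(local_params, global_params)
    return ce_loss + beta * dca_style_term(confidences, correct)
```

In this sketch a client whose parameters point in the same direction as the global model receives the full calibration term, while a client whose parameters are orthogonal to it trains on the task loss alone. A real implementation would compute the term on differentiable tensors inside the training loop rather than on Python lists.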
