FedRandom: Sampling Consistent and Accurate Contribution Values in Federated Learning
Arno Geimer, Beltran Fiz Pontiveros, Radu State
TL;DR
This paper tackles the problem of unstable Shapley-based contribution estimates in federated learning, which can undermine participation and trust. It proposes FedRandom, a sampling-based mitigation that randomizes the selection of the aggregation strategy to generate a large number of contribution samples ($s^r$, where $s=|S|$ and $r$ is the number of rounds), treating each contribution as a noisy estimate of a true underlying value. Empirical results on CIFAR-10/100, MNIST, and Fashion-MNIST with Dirichlet non-IID splits show that FedRandom substantially reduces variance and bias, improving alignment with a size-based ground-truth baseline in a majority of scenarios (e.g., lower $L_2$ and $L_\infty$ distances and a 92% win rate over MSM). Although convergence is not dramatically improved, the method increases trust and fairness in cross-silo federations by providing more stable and reliable contribution evaluations. Its linear overhead is justified in incentive-aware deployments and in future large-scale FL tasks such as LLM fine-tuning.
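To make the $s^r$ count concrete, here is a minimal Python sketch of randomized strategy selection. It assumes that one strategy is drawn uniformly at random from the set $S$ in each round, which is one reading of the summary above and not a confirmed detail of FedRandom. The strategy names and the helper functions `sample_strategy_sequence` and `num_possible_sequences` are hypothetical and serve only as illustration.

```python
import random


def sample_strategy_sequence(strategies, rounds, rng):
    """Draw one aggregation strategy per round, uniformly at random (assumed)."""
    return [rng.choice(strategies) for _ in range(rounds)]


def num_possible_sequences(strategies, rounds):
    """Count the distinct per-round strategy sequences: s ** r with s = |S|."""
    return len(strategies) ** rounds


# Hypothetical strategy set S; the paper's actual set may differ.
strategies = ["FedAvg", "FedAdam", "FedYogi"]
rounds = 5

rng = random.Random(0)  # seeded so the sketch is reproducible
sequence = sample_strategy_sequence(strategies, rounds, rng)
total = num_possible_sequences(strategies, rounds)

print(sequence)
print(total)
```

With three strategies and five rounds there are already $3^5 = 243$ possible sequences, whereas a fixed strategy yields a single one. This gap is what makes a sampling view of contributions possible.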
Abstract
Federated Learning is a privacy-preserving, decentralized approach to Machine Learning tasks. In industry deployments, where a small number of entities each hold abundant data, a participant's role in shaping the global model becomes pivotal: participation in a federation incurs costs, and participants may expect compensation for their involvement. Participant contributions are also a crucial means of identifying and addressing potential malicious actors and free-riders. However, fairly assessing individual contributions remains a significant hurdle. Recent works have demonstrated considerable inherent instability in contribution estimates across aggregation strategies. While employing a different strategy may offer convergence benefits, this instability can harm participants' willingness to engage in the federation. In this work, we introduce FedRandom, a novel mitigation technique for the contribution instability problem. Treating the instability as a statistical estimation problem, FedRandom allows us to generate more samples than regular FL strategies do. We show that these additional samples provide a more consistent and reliable evaluation of participant contributions. We demonstrate our approach using different data distributions across CIFAR-10, MNIST, CIFAR-100 and FMNIST, and show that FedRandom reduces the overall distance to the ground truth by more than a third in half of all evaluated scenarios and improves stability in more than 90% of cases.
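The following Python sketch illustrates what "distance to the ground truth" could look like under simple assumptions. It takes the ground truth to be each participant's share of the total data, assumes contributions are normalized to sum to one, and uses the sample mean as the estimator. None of these choices is confirmed by the text above, and all function names are hypothetical.

```python
import math


def size_based_baseline(sizes):
    """Ground truth assumed here: each participant's share of the total data."""
    total = sum(sizes)
    return [n / total for n in sizes]


def mean_contribution(samples):
    """Average contribution samples per participant (assumed estimator)."""
    n = len(samples)
    return [sum(column) / n for column in zip(*samples)]


def l2_distance(a, b):
    """Euclidean distance between two contribution vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def linf_distance(a, b):
    """Largest per-participant deviation between two contribution vectors."""
    return max(abs(x - y) for x, y in zip(a, b))


# Illustrative numbers only, not results from the paper.
sizes = [100, 300, 600]
samples = [
    [0.2, 0.3, 0.5],  # one noisy contribution sample
    [0.0, 0.3, 0.7],  # another noisy contribution sample
]

baseline = size_based_baseline(sizes)
estimate = mean_contribution(samples)

print(l2_distance(samples[0], baseline), l2_distance(estimate, baseline))
print(linf_distance(samples[0], baseline), linf_distance(estimate, baseline))
```

In this toy case each individual sample deviates from the baseline while their average lands on it. That is the intuition behind treating contributions as noisy estimates; it says nothing about how large the effect is in practice.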
