FBFL: A Field-Based Coordination Approach for Data Heterogeneity in Federated Learning
Davide Domini, Gianluca Aguzzi, Lukas Esterle, Mirko Viroli
TL;DR
FBFL tackles data heterogeneity and resilience in federated learning by using field-based coordination to create self-organizing, spatially localized learning regions with distributed leaders. The approach formalizes regional objectives and leverages the Self-organizing Coordination Regions (SCR) pattern to enable dynamic leader election and hierarchical aggregation without a central server. Empirical results on MNIST, FashionMNIST, and Extended MNIST show that FBFL matches FedAvg under IID data and significantly outperforms FedAvg, FedProx, and Scaffold under non-IID conditions, while remaining resilient to aggregator failures. The work advances decentralized FL by combining personalized regional models with robust, self-stabilizing coordination, offering scalable and privacy-preserving learning suitable for edge and IoT deployments.
Abstract
In recent years, federated learning (FL) has become a popular solution for training machine learning models in domains with high privacy concerns. However, the scalability and performance of FL face significant challenges in real-world deployments where data across devices are not independently and identically distributed (non-IID). This heterogeneity in data distribution frequently arises from the spatial distribution of devices and, if not properly handled, degrades model performance. Additionally, FL's typical reliance on centralized architectures introduces bottlenecks and single-point-of-failure risks, which are particularly problematic at scale or in dynamic environments. To close this gap, we propose Field-Based Federated Learning (FBFL), a novel approach that leverages macroprogramming and field coordination to address these limitations through: (i) distributed, spatially based leader election for personalization, to mitigate non-IID data challenges; and (ii) construction of a self-organizing, hierarchical architecture using advanced macroprogramming patterns. Moreover, FBFL not only overcomes these limitations but also enables the development of more specialized models tailored to the specific data distribution in each subregion. This paper formalizes FBFL and evaluates it extensively on the MNIST, FashionMNIST, and Extended MNIST datasets. We demonstrate that, under IID data conditions, FBFL performs comparably to the widely used FedAvg algorithm. Furthermore, in challenging non-IID scenarios, FBFL not only outperforms FedAvg but also surpasses other state-of-the-art methods, namely FedProx and Scaffold, which were specifically designed to address non-IID data distributions. Additionally, we showcase the resilience of FBFL's self-organizing hierarchical architecture against server failures.
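
To make the idea of spatially localized regions with per-region aggregation more concrete, the following is a minimal Python sketch, not the FBFL implementation. It assumes that leaders are already known (FBFL elects them in a distributed way through field coordination, which is not reproduced here), that each device joins the region of its nearest leader by Euclidean distance, and that each region's model is a FedAvg-style average weighted by local dataset size. The names `assign_regions` and `regional_average`, as well as the toy positions, weights, and sizes, are hypothetical and chosen only for illustration.

```python
from math import dist


def assign_regions(positions, leaders):
    """Map each device id to the id of its nearest leader (Euclidean distance).

    Assumption: leaders are given; the distributed election itself is not modeled.
    """
    return {
        dev: min(leaders, key=lambda leader: dist(pos, positions[leader]))
        for dev, pos in positions.items()
    }


def regional_average(weights, sizes, regions):
    """Compute one model per region as a dataset-size-weighted parameter average."""
    models = {}
    for leader in set(regions.values()):
        members = [dev for dev, led in regions.items() if led == leader]
        total = sum(sizes[dev] for dev in members)
        n_params = len(weights[members[0]])
        models[leader] = [
            sum(weights[dev][i] * sizes[dev] for dev in members) / total
            for i in range(n_params)
        ]
    return models


# Toy data (hypothetical): four devices on a line, two spatial clusters.
positions = {"a": (0.0, 0.0), "b": (1.0, 0.0), "c": (9.0, 0.0), "d": (10.0, 0.0)}
weights = {"a": [1.0, 2.0], "b": [3.0, 4.0], "c": [10.0, 20.0], "d": [30.0, 40.0]}
sizes = {"a": 1, "b": 1, "c": 1, "d": 3}

# Two leaders yield two regions, each with its own specialized model.
regions = assign_regions(positions, leaders=["a", "d"])
models = regional_average(weights, sizes, regions)

# If leader "d" fails, re-running the assignment with the surviving leader
# folds its devices into the remaining region instead of halting training.
regions_after_failure = assign_regions(positions, leaders=["a"])
models_after_failure = regional_average(weights, sizes, regions_after_failure)
```

In this sketch, devices `a` and `b` share one regional model and `c` and `d` another, so each model reflects only its region's data. Recomputing the assignment after a leader disappears is a simplified stand-in for the self-organizing recovery described in the abstract.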
