Enhancing the Convergence of Federated Learning Aggregation Strategies with Limited Data
Judith Sáinz-Pardo Díaz, Álvaro López García
TL;DR
The paper tackles privacy constraints in medical imaging by evaluating federated learning aggregation strategies for brain MRI classification with limited data, and by introducing FedAvgOpt, an optimized aggregation method. FedAvgOpt computes the aggregated weights at round $r$ by solving a nonlinear optimization problem that weights the client contributions: $w_{FedAvgOpt}^{r}=\frac{1}{\sum_{i} n_i}\sum_{i} n_i w_i^{r} \alpha_i^{r}$, where $n_i$ is the number of samples held by client $i$, $w_i^{r}$ are its local weights, and the coefficient vector $\boldsymbol{\alpha}^{r}$ is obtained with the Nelder-Mead method; when $\boldsymbol{\alpha}^{r}=1_n$, it reduces to FedAvg. Across four CNN backbones (VGG16, InceptionV3, ResNet-50 V2, DenseNet121) deployed in a 4-class brain MRI task with distributed data, FedAvgOpt consistently outperformed FedAvg, FedAvgM, FedMedian, FedOpt, and FedYogi, delivering faster convergence and higher accuracy. This demonstrates a practical, privacy-preserving approach to collaborative medical imaging model development across data centers. The results suggest the approach is well suited to multi-institution collaboration under stringent data governance.
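As a rough illustration of the weighted aggregation described in this summary, the following Python sketch scales each client's weights by $n_i \alpha_i$ and searches for the coefficients with SciPy's Nelder-Mead implementation. The summary does not state the objective of the nonlinear problem, so `toy_objective` is a hypothetical stand-in (squared distance to an arbitrary target); the function names, the flattened-weight representation, and the toy data are likewise assumptions for illustration, not the authors' implementation.

```python
import numpy as np
from scipy.optimize import minimize


def aggregate(client_weights, n_samples, alpha):
    """Return (1 / sum_i n_i) * sum_i n_i * alpha_i * w_i for flattened client weights."""
    w = np.asarray(client_weights, dtype=float)  # shape (n_clients, n_params)
    n = np.asarray(n_samples, dtype=float)       # shape (n_clients,)
    a = np.asarray(alpha, dtype=float)           # shape (n_clients,)
    return (n * a) @ w / n.sum()


def fedavg_opt(client_weights, n_samples, objective):
    """Search for alpha with Nelder-Mead, starting from the FedAvg point alpha = 1_n."""
    alpha0 = np.ones(len(n_samples))
    result = minimize(
        lambda a: objective(aggregate(client_weights, n_samples, a)),
        alpha0,
        method="Nelder-Mead",
    )
    return aggregate(client_weights, n_samples, result.x), result.x


# Toy data: three clients with two parameters each (illustrative values only).
client_weights = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
n_samples = np.array([10, 20, 30])
target = np.array([0.5, 0.5])  # hypothetical optimum, not taken from the paper


def toy_objective(w):
    """Hypothetical objective: squared distance of the aggregate to `target`."""
    return float(np.sum((w - target) ** 2))


# alpha = 1_n recovers the sample-weighted FedAvg aggregate.
w_fedavg = aggregate(client_weights, n_samples, np.ones(3))
w_opt, alpha_opt = fedavg_opt(client_weights, n_samples, toy_objective)
```

In a real federated setting the objective would be evaluated on the aggregated model itself, for example through a loss computed on the server side; that choice is left open here because the summary does not specify it. Starting the search at $\boldsymbol{\alpha}=1_n$ means the optimized aggregate is never worse than FedAvg under the chosen objective.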
Abstract
Deep learning is increasingly applied to medical data, particularly to diagnostic imaging. Such data is subject to privacy and legal restrictions that in many cases prevent it from being processed on central servers. At the same time, collaboration between research centers is critical for building models that are as robust as possible, trained on the largest quantity and diversity of data available. Privacy-aware distributed architectures such as federated learning address this need. In federated learning, the server aggregates the local models trained on each data owner's data to build a global model. This aggregation step is critical, so it is essential to analyze different aggregation methods for each use case, taking into account the distribution of the clients, the characteristics of the model, and so on. In this paper we propose a novel aggregation strategy and apply it to the classification of brain magnetic resonance images. In this use case, the proposed aggregation function improves convergence over the rounds of the federated learning process compared with several classical aggregation strategies.
